General Information

Security in computer systems and networks is emerging as one of the most challenging
research areas for the future. The main aim of the school is to offer a good spectrum of
current research in foundations of security, ranging from programming languages to
analysis of protocols, that can be of help to graduate students and young researchers from
academia or industry who intend to approach the field. As in the previous edition
(FOSAD'00), the school covers two weeks (from Monday 17 to Saturday 29 September
2001) and alternates four lecturers per week on monographic courses of about 6 to 8
hours each. Saturdays are reserved for presentations given by those participants who
intend to take advantage of the audience to discuss their current research in the
area. The school is organised at the Centro Residenziale Universitario of the University
of Bologna, situated in Bertinoro, a small village on a scenic hill with a wonderful
panorama, between Forlì and Cesena (about 50 miles south-east of Bologna and 15 miles
from the Adriatic Sea).

Security  for  Multimedia  Traffic  over  IP  -  Rosario  Gennaro  (IBM,  T.J. 
Validating Firewalls using Flow Logics
Abstract. The ambient calculus is a calculus of computation that allows
active processes to communicate and to move between sites. A site is said
to be a protective firewall whenever it denies entry to all attackers not
possessing the required passwords. We devise a computationally sound test for
validating the protectiveness of a proposed firewall and show how to perform
the test in polynomial time.

The first step is the definition of a flow logic for analysing the flow of control
in mobile ambients; it amounts to a syntax-directed specification of the
acceptability of a control flow estimate. The second step is to define a hardest
attacker and to determine whether or not there exists a control flow
estimate that shows the inability of the hardest attacker to enter; if such an
estimate exists, then none of the infinitely many attackers can enter unless
they contain at least one of the passwords, and consequently the firewall
cannot contain any trap doors.

Keywords:  Hardest  attackers,  static  analysis,  control-flow  analysis,  flow 
logic,  mobile  ambients,  firewalls. 


1  Introduction 


The ambient calculus is a calculus of computation that is based on traditional
process algebras (such as the π-calculus [15]). The main focus is not on communication,
however, but on the ability of active processes to move between sites representing
administrative domains; the calculus thereby extends the notion of mobility found
in Java [13] where only passive code can be moved between sites. Both processes
and sites are modelled as ambients; their ability to move around is governed by
the capabilities possessed. The calculus was introduced in [7] and has been studied
extensively [6, 8, 9, 14, 18, 19, 22]. We refer to Section 2 for a review of the ambient
calculus.

Since processes may evolve when moving around, the structure of a system of
ambients is very dynamic. In Section 3 we therefore develop a control flow analysis [17]
for predicting the set of processes that may turn up inside a given ambient. This
takes the form of defining a flow logic [16] for checking whether or not a control flow
estimate (as might have been produced by a control flow analysis) is indeed
acceptable; in the absence of higher-order features in the ambient calculus, this amounts
to a syntax-directed definition of a number of judgements. The analysis combines
the ability to handle communication (in the manner of analyses for the π-calculus
[2, 3]) with the ability to handle movement (in the manner of an analysis for the
communication-free fragment [18]). Semantic correctness is established by proving
that all acceptable analyses are semantically sound (by means of a subject-reduction
result in the manner of type systems).


On  the  algorithmic  side  we  show  that  there  always  is  a  least  control  flow  estimate 
and  that  it  can  be  computed  in  polynomial  time;  this  takes  the  form  of  generating 
a  set  of  constraints  that  is  then  solved  by  a  worklist  algorithm  [17]. 

In  [7]  the  communication-free  fragment  of  the  ambient  calculus  is  used  to  model  and 
study  a  firewall  where  only  agents  knowing  the  required  passwords  are  supposed  to 
enter;  indeed,  assuming  fairness,  it  is  shown  that  all  agents  making  correct  use  of 
all  the  passwords  will  in  fact  enter.  However,  in  the  interest  of  security  and  safety 
of  systems,  it  is  at  least  as  important  to  ensure  that  an  attacker  not  knowing  any 
of  the  passwords  cannot  possibly  enter;  we  shall  say  that  the  firewall  is  protective 
when  this  is  the  case.  As  an  example,  a  protective  firewall  cannot  contain  trap  doors 
or  other  ways  of  circumventing  the  protection  offered  by  the  passwords. 

The difficulty, of course, is that there are infinitely many attackers that do not know
the passwords, and that it seems infeasible (and indeed undecidable) to perform
automatic tests that will guard against all of them. To overcome these problems we
change, in Section 4, the “level of granularity” of our observations to coincide with
those of the control flow analysis. We then prove that there is a process, called a
hardest attacker, such that:

If  there  exists  a  control  flow  estimate  that  shows  the  inability  of  the  hardest 
attacker  to  enter,  then  none  of  the  infinitely  many  attackers  can  enter  unless 
they  contain  at  least  one  of  the  passwords. 

The  ability  to  identify  hardest  attackers  is  perhaps  comparable  to  the  ability  to 
identify  hardest  problems  in  given  complexity  classes.  To  argue  the  case  in  less 
technical  jargon,  consider  the  following  “folk  theorem”: 

Testing¹ can prove the presence of bugs but never their absence.

Unfortunately, this has led to the widespread belief that no experimentation with
software can be used for formally validating software. The technical results presented
here, generalising those of [18], provide a concrete instance of the rather different,
and more useful, insight:

Testing² can prove the absence of bugs but never their presence.

Expanding  the  area  of  applicability  of  this  insight  will  likely  lead  to  fundamental 
changes  in  the  validation  of  software  used  in  security  oriented  applications.  The 
ability  to  extend  the  analysis  to  the  existing  software  base,  perhaps  involving  legacy 
code  of  “unknown”  origin,  offers  a  level  of  guarantee  well  above  that  of  other  formal 
approaches. 


2  Mobile  Ambients 

Syntax. The presentation of the ambient calculus as given in [7] actually defines a
“pre-syntax” for ambients. One aspect of this is that not all the defined ambient
expressions are meaningful and hence a type system [8] is needed to rule out the
undesired elements; the other aspect is that some of the clarifications made in the
type system could in fact equally well be performed in the abstract syntax. To
avoid the artificial problem of devising semantics and static analysis for blatantly
meaningless expressions we shall use a slightly more refined syntax that makes some
of the distinctions of the type system.

¹ Testing in the sense of dynamically running a program on a number of inputs.

² Testing in the sense of statically analysing a program on a number of inputs.

Table 1. Abstract syntax.

The syntax of ambients in Table 1 is built around three syntactic categories: a
class of processes, ranged over by $P \in \mathbf{Proc}$, a class of capabilities, ranged over
by $M \in \mathbf{Cap}$, and a class of namings, ranged over by $N \in \mathbf{Nam}$. We follow
[7] in distinguishing between names (introduced by the restriction operator known
from process algebras) and variables (introduced by input statements); we also
distinguish between variables used for holding capabilities, ranged over by $x \in \mathbf{Var}^c$,
and variables used for holding names, ranged over by $u \in \mathbf{Var}^n$. We explain the
constructs below.

First we consider processes (ignoring the superscript annotations in Table 1).
Borrowing from the π-calculus [15], local scope is managed using the restriction operator.
Also there is the inactive process, the parallel composition of two processes and a
replicated process that is allowed to unfold to arbitrarily many (“infinitely many”)
copies of the process. The next two constructs are unique to the ambient calculus.
An ambient is a process operating inside a named border. Movement of ambients is
governed by capabilities (explained below) and includes the ability for an ambient
to move out of an enclosing ambient and for an ambient to move into a sibling
ambient. Output of capabilities and names is much as in the π-calculus except that the
channel is implicit and embedded in the enclosing ambient itself. Similarly for input
of capabilities and names. As usual, trailing inactive actions will often be omitted
from examples.

Next we consider capabilities and namings (once more ignoring the superscript
annotations in Table 1). The in-capability directs the enclosing ambient to enter a
sibling named $N$; this is illustrated in Figure 1 and will be explained in detail when
we consider the semantics below. The out-capability directs the enclosing ambient
to move out of its parent named $N$. The open-capability dissolves the border around
a sibling ambient named $N$. Since capabilities can be communicated we also need
variables ranging over them. Capabilities include the null capability as well as the
sequential composition of capabilities describing a path to the desired destination.
Namings are names but since they can be communicated we also need variables
ranging over them.
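The constructs described above can be pictured as an algebraic data type. The following is a minimal sketch in Python, reconstructed only from the prose (the body of Table 1 is not reproduced here); all class and field names are hypothetical, and namings, labels, stable names and binders are modelled as plain strings.

```python
from dataclasses import dataclass
from typing import Union

# Capabilities M (names of classes and fields are illustrative assumptions).
@dataclass(frozen=True)
class Move:                 # in, out or open, applied to a naming N
    kind: str               # "in", "out" or "open"
    label: str              # label annotating the capability
    target: str             # naming N: a name or a name variable

@dataclass(frozen=True)
class CapVar:               # variable x holding a capability
    var: str

@dataclass(frozen=True)
class Eps:                  # null capability
    pass

@dataclass(frozen=True)
class Seq:                  # path M.M'
    first: "Cap"
    rest: "Cap"

Cap = Union[Move, CapVar, Eps, Seq]

# Processes P.
@dataclass(frozen=True)
class Nil:                  # inactive process
    pass

@dataclass(frozen=True)
class Par:                  # parallel composition
    left: "Proc"
    right: "Proc"

@dataclass(frozen=True)
class Bang:                 # replication
    body: "Proc"

@dataclass(frozen=True)
class Restrict:             # restriction of name with stable name annotation
    name: str
    stable: str
    body: "Proc"

@dataclass(frozen=True)
class Amb:                  # ambient: naming, label, enclosed process
    naming: str
    label: str
    body: "Proc"

@dataclass(frozen=True)
class Prefix:               # M.P
    cap: Cap
    body: "Proc"

@dataclass(frozen=True)
class Input:                # input of a capability or name, annotated with a binder
    var: str
    binder: str
    body: "Proc"

@dataclass(frozen=True)
class Output:               # output of a capability (Cap) or naming (str), with a label
    value: Union[Cap, str]
    label: str

Proc = Union[Nil, Par, Bang, Restrict, Amb, Prefix, Input, Output]

def ambient_labels(p):
    """Collect the labels of all ambients occurring in process p."""
    if isinstance(p, Amb):
        return {p.label} | ambient_labels(p.body)
    if isinstance(p, Par):
        return ambient_labels(p.left) | ambient_labels(p.right)
    if isinstance(p, (Bang, Restrict, Prefix, Input)):
        return ambient_labels(p.body)
    return set()            # Nil and Output contain no ambients

# An agent that opens k and then offers an ambient k'' (body taken to be inactive).
agent = Amb("k'", "C", Prefix(Move("open", "6", "k"), Amb("k''", "D", Nil())))
```

Frozen dataclasses are used so that terms are immutable and comparable by structure, which matches the way syntax is treated in the text.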

Fig. 1. Pictorial representation of the basic reduction rules.

Annotations. Let us now return to the two kinds of superscript annotations used
in the syntax of Table 1. One class of annotations is composed of the stable names,
ranged over by $\mu \in \mathbf{SNam}$, for names occurring in restrictions, and the binders,
ranged over by $\beta^c \in \mathbf{Bnd}^c$ and $\beta^n \in \mathbf{Bnd}^n$, for variables occurring in input actions;
we occasionally use $\beta \in \mathbf{Bnd} = \mathbf{Bnd}^c \cup \mathbf{Bnd}^n$. These annotations are necessary
for our analysis because the semantics of the ambient calculus borrows from the
π-calculus in allowing α-conversion of bound names and variables. As an example,
suppose we consider the ambient system

\[ (\nu n)(\nu m)\, n[m[0]] \]

and that our control flow estimate correctly says that $m$ occurs inside $n$ but not vice
versa; unfortunately, this makes little sense since α-conversion allows us to change
the above ambient system to

\[ (\nu m)(\nu n)\, m[n[0]] \]

and now the control flow estimate is no longer correct. To circumvent this problem
we ensure that control flow estimates always refer to stable names and binders.
Continuing the example, when we consider

\[ (\nu n^{N})(\nu m^{M})\, n[m[0]] \]

we say that $M$ occurs inside $N$; this then remains correct for the α-converted system

\[ (\nu m^{N})(\nu n^{M})\, m[n[0]] \]

since stable names are never changed by α-conversion. Indeed, one way to
understand the distinction between names and stable names is to regard the stable names
as static representations of the names arising dynamically. The considerations for
variables and binders are analogous.
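The effect of α-conversion on the two kinds of containment facts can be checked mechanically. The sketch below is an illustration only: the encoding of a system as a list of (name, stable name) restrictions plus a nested (name, children) tree, and the helper names, are assumptions made for this example.

```python
def rename(tree, mapping):
    """Apply a simultaneous renaming of names to an ambient tree."""
    name, children = tree
    return (mapping.get(name, name), [rename(c, mapping) for c in children])

def alpha_convert(system, mapping):
    """Rename bound names; the stable name annotations are left untouched."""
    binders, tree = system
    new_binders = [(mapping.get(n, n), stable) for n, stable in binders]
    return (new_binders, rename(tree, mapping))

def containment(tree):
    """Pairs (outer, inner) of names of directly nested ambients."""
    name, children = tree
    pairs = {(name, child[0]) for child in children}
    for child in children:
        pairs |= containment(child)
    return pairs

def stable_containment(system):
    """The same pairs, expressed through stable names."""
    binders, tree = system
    stable = dict(binders)
    return {(stable[outer], stable[inner]) for outer, inner in containment(tree)}

# Restrictions of n (stable name N) and m (stable name M) around n[m[0]].
system = ([("n", "N"), ("m", "M")], ("n", [("m", [])]))
# The alpha-converted system: m[n[0]] with the stable names in the same positions.
swapped = alpha_convert(system, {"n": "m", "m": "n"})
```

Facts phrased with names flip under the renaming, while facts phrased with stable names are unchanged, which is the reason the estimates refer to stable names.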

The other class of annotations used in Table 1 are the labels. They assist in
developing the control flow analysis by precisely pinpointing program points
inside the ambient system; for this purpose it would be natural to let all labels be
distinct since indeed all program points are distinct. As an example, in a system
like

\[ [\ldots] \mid m^{2}[\mathit{out}\ n]] \]

this will allow us to distinguish the two occurrences of an ambient named $m$. As we
shall see, labels allow us to control the complexity (and precision) of the analysis by
treating one or more program points alike; for this purpose it may be appropriate
only to use a few labels. Indeed, one way to understand the use of labels is to regard
labels as amalgamations of a number of program points.

We use $l \in \mathbf{Lab}$ to range over the set of labels. More specifically, we view labels as
being defined by

\[ \mathbf{Lab} = \mathbf{Lab}^a \cup \mathbf{Lab}^i \cup \mathbf{Lab}^o \cup \mathbf{Lab}^p \cup \mathbf{Lab}^c \cup \mathbf{Lab}^n \]

and use $l^a \in \mathbf{Lab}^a$ to annotate ambients, $l^i \in \mathbf{Lab}^i$ to annotate in-capabilities,
$l^o \in \mathbf{Lab}^o$ to annotate out-capabilities, $l^p \in \mathbf{Lab}^p$ to annotate open-capabilities,
$l^c \in \mathbf{Lab}^c$ to annotate the output of capabilities, and $l^n \in \mathbf{Lab}^n$ to annotate the
output of names.

We shall assume that the sets of labels, $\mathbf{Lab}$, stable names, $\mathbf{SNam}$, binders for the
input of capabilities, $\mathbf{Bnd}^c$, and binders for the input of names, $\mathbf{Bnd}^n$, are pairwise
disjoint. It may aid the intuition to assume that the different sets of labels, listed
above, are also pairwise disjoint. Finally, we assume that all the sets mentioned are
non-empty; for any given program they can always be assumed to be finite since
the semantics below does not create new labels, stable names or binders.

We write $\mathrm{fn}(P)$ for the set of free names of $P$ (and similarly for $M$ and $N$). Similarly
we write $\mathrm{fv}(P)$ for the set of free variables of $P$ (and similarly for $M$ and $N$); a
process is closed whenever it has no free variables, i.e. $\mathrm{fv}(P) = \emptyset$, but may of course
contain free names. Let $n_\star$ be a distinguished name and $l_\star$ a distinguished label;
then the programs of interest are ambients of the form $n_\star^{l_\star}[P_\star]$ where $P_\star$ is closed,
where $n_\star \notin \mathrm{fn}(P_\star)$ and where $l_\star$ does not occur in $P_\star$.

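Example 1. Consider the following example from [7] for illustrating how an agent
crosses a firewall using the prearranged passwords (or secret keys) $\mathtt{k}$, $\mathtt{k'}$ and $\mathtt{k''}$:

\[
\begin{array}{lcl}
\mathit{Firewall} &:& (\nu \mathtt{w}^{w})\, \mathtt{w}^{\mathrm{A}}[\, \mathtt{k}^{\mathrm{B}}[\mathit{out}^{1}\,\mathtt{w}.\ \mathit{in}^{2}\,\mathtt{k'}.\ \mathit{in}^{3}\,\mathtt{w}] \mid \mathit{open}^{4}\,\mathtt{k'}.\ \mathit{open}^{5}\,\mathtt{k''}.\ P \,] \\
\mathit{Agent} &:& \mathtt{k'}^{\mathrm{C}}[\, \mathit{open}^{6}\,\mathtt{k}.\ \mathtt{k''}^{\mathrm{D}}[Q] \,]
\end{array}
\]

The program of interest is $n_\star^{l_\star}[\mathit{Firewall} \mid \mathit{Agent}]$. We use typewriter font for names,
italics for stable names, roman for ambient labels, and numbers for labels of
capabilities. □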


Semantics. The semantics is given by a structural congruence relation $P \equiv Q$ and a
reduction relation $P \to Q$ in the manner of the π-calculus. The congruence relation
is inductively defined by the axioms and rules of Table 2; apart from a few differences
noted below it is a straightforward modification of a table in [7]. The axioms and
rules in the left hand column ensure that the relation is an equivalence relation,
that it is a congruence, and that parallel composition is commutative and associative
with the inactive process as a unit; they also describe the behaviour of replication.

The rules and axioms in the right hand column of Table 2 allow us to change the
placement of restriction operators. In the axiom for $(\nu n^{\mu_n})(\nu m^{\mu_m})P$ we have added
the side condition “if $n \neq m$” to ensure that the association between names and
stable names is not modified by the structural congruence. Next we have axioms for
α-conversion; here we write $P\{n \leftarrow m\}$ for the process that is as $P$ but with all free
occurrences of $n$ replaced by $m$ (taking care to α-convert so as to avoid the capture
of $m$ by any restriction operator). The final two axioms control the expansion of
capabilities.

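The reduction relation is inductively defined by the axioms and rules of Table 3. It
is as in [7] and a pictorial representation of the five basic axioms is given in Figure
1. The remaining rules ensure that reduction can take place in the contexts of
restrictions, ambients and parallel compositions and that the structural congruence
can freely be used to rearrange ambient expressions. Note that no internal reduction
can take place “inside” movement or input prefixes. Also note that in each reduction,
exactly one of the basic axioms is used.

\[
\begin{array}{l@{\qquad}l}
P \equiv P
  & (\nu n^{\mu})0 \equiv 0 \\
P \equiv Q \wedge Q \equiv R \Rightarrow P \equiv R
  & (\nu n^{\mu_n})(\nu m^{\mu_m})P \equiv (\nu m^{\mu_m})(\nu n^{\mu_n})P \quad \text{if } n \neq m \\
P \equiv Q \Rightarrow Q \equiv P
  & (\nu n^{\mu})(P \mid Q) \equiv P \mid (\nu n^{\mu})Q \quad \text{if } n \notin \mathrm{fn}(P) \\
P \equiv Q \Rightarrow (\nu n^{\mu})P \equiv (\nu n^{\mu})Q
  & (\nu n^{\mu})(m^{l^a}[P]) \equiv m^{l^a}[(\nu n^{\mu})P] \quad \text{if } n \neq m \\
P \equiv Q \Rightarrow P \mid R \equiv Q \mid R
  & (\nu n^{\mu})P \equiv (\nu m^{\mu})(P\{n \leftarrow m\}) \quad \text{if } m \notin \mathrm{fn}(P) \ \text{(α-renaming)} \\
P \equiv Q \Rightarrow {!P} \equiv {!Q}
  & (x^{\beta^c}).P \equiv (x'^{\beta^c}).(P\{x \leftarrow x'\}) \quad \text{if } x' \notin \mathrm{fv}(P) \ \text{(α-renaming)} \\
P \equiv Q \Rightarrow N^{l^a}[P] \equiv N^{l^a}[Q]
  & (\!(u^{\beta^n})\!).P \equiv (\!(u'^{\beta^n})\!).(P\{u \leftarrow u'\}) \quad \text{if } u' \notin \mathrm{fv}(P) \ \text{(α-renaming)} \\
P \equiv Q \Rightarrow M.P \equiv M.Q
  & (M.M').P \equiv M.M'.P \\
P \equiv Q \Rightarrow (x^{\beta^c}).P \equiv (x^{\beta^c}).Q
  & \epsilon.P \equiv P \\
P \equiv Q \Rightarrow (\!(u^{\beta^n})\!).P \equiv (\!(u^{\beta^n})\!).Q & \\
P \mid Q \equiv Q \mid P & \\
(P \mid Q) \mid R \equiv P \mid (Q \mid R) & \\
P \mid 0 \equiv P & \\
{!P} \equiv P \mid {!P} & \\
{!0} \equiv 0 &
\end{array}
\]

Table 2. Structural congruence.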

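Example 2. We have the following sequence of reduction steps for $n_\star^{l_\star}[\mathit{Firewall} \mid \mathit{Agent}]$;
in each step we have underlined the capability to be executed next and we
have assumed that $\mathtt{w} \notin \mathrm{fn}(Q)$.

\[
\begin{array}{cl}
 & n_\star^{l_\star}[\,(\nu \mathtt{w}^{w})\,\mathtt{w}^{\mathrm{A}}[\,\mathtt{k}^{\mathrm{B}}[\underline{\mathit{out}^{1}\,\mathtt{w}}.\ \mathit{in}^{2}\,\mathtt{k'}.\ \mathit{in}^{3}\,\mathtt{w}] \mid \mathit{open}^{4}\,\mathtt{k'}.\ \mathit{open}^{5}\,\mathtt{k''}.\ P\,] \mid \mathtt{k'}^{\mathrm{C}}[\mathit{open}^{6}\,\mathtt{k}.\ \mathtt{k''}^{\mathrm{D}}[Q]]\,] \\
\to & n_\star^{l_\star}[\,(\nu \mathtt{w}^{w})(\,\mathtt{k}^{\mathrm{B}}[\underline{\mathit{in}^{2}\,\mathtt{k'}}.\ \mathit{in}^{3}\,\mathtt{w}] \mid \mathtt{w}^{\mathrm{A}}[\mathit{open}^{4}\,\mathtt{k'}.\ \mathit{open}^{5}\,\mathtt{k''}.\ P] \mid \mathtt{k'}^{\mathrm{C}}[\mathit{open}^{6}\,\mathtt{k}.\ \mathtt{k''}^{\mathrm{D}}[Q]]\,)\,] \\
\to & n_\star^{l_\star}[\,(\nu \mathtt{w}^{w})(\,\mathtt{w}^{\mathrm{A}}[\mathit{open}^{4}\,\mathtt{k'}.\ \mathit{open}^{5}\,\mathtt{k''}.\ P] \mid \mathtt{k'}^{\mathrm{C}}[\,\mathtt{k}^{\mathrm{B}}[\mathit{in}^{3}\,\mathtt{w}] \mid \underline{\mathit{open}^{6}\,\mathtt{k}}.\ \mathtt{k''}^{\mathrm{D}}[Q]\,]\,)\,] \\
\to & n_\star^{l_\star}[\,(\nu \mathtt{w}^{w})(\,\mathtt{w}^{\mathrm{A}}[\mathit{open}^{4}\,\mathtt{k'}.\ \mathit{open}^{5}\,\mathtt{k''}.\ P] \mid \mathtt{k'}^{\mathrm{C}}[\,\underline{\mathit{in}^{3}\,\mathtt{w}} \mid \mathtt{k''}^{\mathrm{D}}[Q]\,]\,)\,] \\
\to & n_\star^{l_\star}[\,(\nu \mathtt{w}^{w})\,\mathtt{w}^{\mathrm{A}}[\,\underline{\mathit{open}^{4}\,\mathtt{k'}}.\ \mathit{open}^{5}\,\mathtt{k''}.\ P \mid \mathtt{k'}^{\mathrm{C}}[\mathtt{k''}^{\mathrm{D}}[Q]]\,]\,] \\
\to & n_\star^{l_\star}[\,(\nu \mathtt{w}^{w})\,\mathtt{w}^{\mathrm{A}}[\,\underline{\mathit{open}^{5}\,\mathtt{k''}}.\ P \mid \mathtt{k''}^{\mathrm{D}}[Q]\,]\,] \\
\to & n_\star^{l_\star}[\,(\nu \mathtt{w}^{w})\,\mathtt{w}^{\mathrm{A}}[\,P \mid Q\,]\,]
\end{array}
\]

The transition sequence shows that the firewall (which has the private name $\mathtt{w}$)
sends out the pilot ambient named $\mathtt{k}$; since the agent knows the right passwords,
and is in the right form, the pilot ambient can enter the agent and then guide it
inside the firewall. □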
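The reduction sequence of Example 2 can be replayed with a small simulator. This is a sketch under simplifying assumptions, not the semantics of Table 3 in full: restriction, replication and communication are omitted (the private name w is treated as an ordinary name), P and Q are inert placeholders, and the list encoding and function names are hypothetical.

```python
def amb(name, label, *items):
    return ["amb", name, label, list(items)]

def cap(kind, target, *continuation):
    return ["cap", kind, target, list(continuation)]

def proc(name):                      # inert placeholder such as P or Q
    return ["proc", name]

def ambients(ps):
    return [p for p in ps if p[0] == "amb"]

def caps(ps, kind):
    return [p for p in ps if p[0] == "cap" and p[1] == kind]

def drop(ps, item):                  # remove by identity, not by equality
    del ps[next(i for i, p in enumerate(ps) if p is item)]

def step(ps):
    """Fire one basic axiom inside the process list ps (mutating it).
    Returns (kind, target) of the consumed capability, or None if stuck."""
    for c in caps(ps, "open"):       # open n.P | n[Q]  ->  P | Q
        for b in ambients(ps):
            if b[1] == c[2]:
                drop(ps, c); drop(ps, b)
                ps.extend(c[3] + b[3])
                return ("open", c[2])
    for a in ambients(ps):           # n[in m.P | Q] | m[R]  ->  m[n[P | Q] | R]
        for c in caps(a[3], "in"):
            for b in ambients(ps):
                if b is not a and b[1] == c[2]:
                    drop(a[3], c); a[3].extend(c[3])
                    drop(ps, a); b[3].append(a)
                    return ("in", c[2])
    for b in ambients(ps):           # m[n[out m.P | Q] | R]  ->  n[P | Q] | m[R]
        for a in ambients(b[3]):
            for c in caps(a[3], "out"):
                if c[2] == b[1]:
                    drop(a[3], c); a[3].extend(c[3])
                    drop(b[3], a); ps.append(a)
                    return ("out", c[2])
    for b in ambients(ps):           # otherwise reduce inside an ambient
        fired = step(b[3])
        if fired:
            return fired
    return None

firewall = amb("w", "A",
               amb("k", "B", cap("out", "w", cap("in", "k'", cap("in", "w")))),
               cap("open", "k'", cap("open", "k''", proc("P"))))
agent = amb("k'", "C", cap("open", "k", amb("k''", "D", proc("Q"))))
top = [amb("n*", "l*", firewall, agent)]

trace = []
while (fired := step(top)) is not None:
    trace.append(fired)
```

After the loop, `trace` lists the capabilities in the order they were consumed, and `top` holds the final configuration with P and Q side by side inside the ambient labelled A.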

Properties of the semantics. Recall that the programs of interest are ambients of
the form $n_\star^{l_\star}[P_\star]$ where $P_\star$ is closed (and hence contains no free variables but may
contain free names), $n_\star \notin \mathrm{fn}(P_\star)$ and $P_\star$ does not contain $l_\star$. It follows that only
closed expressions are reduced, that no new names become free and that no new
labels come into existence. Because of the structural congruence it is not the case
that programs evolve into programs (in particular processes of the form $n_\star^{l_\star}[\cdots]$); to
achieve this we could restrict the use of the congruence in the semantics. Instead,
we shall use the following result showing that programs evolve into processes that
are congruent to programs.
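\[
\begin{array}{l@{\qquad}l}
P \to Q \Rightarrow (\nu n^{\mu})P \to (\nu n^{\mu})Q
  & n^{l_1}[\mathit{in}^{l_2}\, m.\ P \mid Q] \mid m^{l_3}[R] \to m^{l_3}[\, n^{l_1}[P \mid Q] \mid R \,] \\
P \to Q \Rightarrow n^{l}[P] \to n^{l}[Q]
  & m^{l_1}[\, n^{l_2}[\mathit{out}^{l_3}\, m.\ P \mid Q] \mid R \,] \to n^{l_2}[P \mid Q] \mid m^{l_1}[R] \\
P \to Q \Rightarrow P \mid R \to Q \mid R
  & \mathit{open}^{l_1}\, n.\ P \mid n^{l_2}[Q] \to P \mid Q \\
P \equiv P' \wedge P' \to Q' \wedge Q' \equiv Q \Rightarrow P \to Q
  & (x^{\beta^c}).P \mid \langle M \rangle^{l} \to P\{x \leftarrow M\} \\
  & (\!(u^{\beta^n})\!).P \mid \langle\!\langle N \rangle\!\rangle^{l} \to P\{u \leftarrow N\}
\end{array}
\]

Table 3. Reduction relation.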


Fact 1. Programs evolve to programs (modulo the congruence):

If $P$ is congruent to a program and $P \to^{*} Q$ then also $Q$ is congruent to a
program.


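Proof. We first investigate what it means for a process to be congruent to a program.
For this we extend the syntax of Table 1 with

\[
\begin{array}{rcll}
T &::=& (\nu n^{\mu})T & \\
  &\mid& 0 & \\
  &\mid& T \mid T' & \\
  &\mid& {!T} & \\
  &\mid& L.T & \\[4pt]
S &::=& (\nu n^{\mu})S & \text{if } n \neq n_\star \\
  &\mid& S \mid T & \\
  &\mid& T \mid S & \\
  &\mid& n_\star^{l_\star}[P] & \text{if } n_\star^{l_\star}[P] \text{ is a program} \\
  &\mid& L.S & \\[4pt]
L &::=& \epsilon \ \mid\ L.L' &
\end{array}
\]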

Let $\mathbf{Triv}$ be the set of processes described by $T$ and let $\mathbf{Ser}$ be the set of processes
described by $S$; the former clearly contains the inactive process 0 and the latter
clearly contains all programs. We now show that $\mathbf{Triv}$, respectively $\mathbf{Ser}$, is the
closure of the set $\{0\}$, respectively the set of programs, under the congruence.

It is immediate to prove by induction on $L$ that $L.T \equiv T$ and $L.S \equiv S$. It is also
immediate to prove by induction on $T$ that $T \equiv 0$; hence all processes in $\mathbf{Triv}$ are
congruent to 0. Furthermore, by induction on $S$ one can show that $S \equiv n_\star^{l_\star}[P]$ for
some $P$ such that $n_\star^{l_\star}[P]$ is a program; hence all processes in $\mathbf{Ser}$ are congruent to
programs.

For the opposite inclusions we prove that if $P \equiv Q$ and $P \in \mathbf{Triv}$ then $Q \in \mathbf{Triv}$,
and similarly that if $P \equiv Q$ and $Q \in \mathbf{Triv}$ then $P \in \mathbf{Triv}$, by induction on the
inference; it follows that $\mathbf{Triv}$ is the set of processes that are congruent to 0. Next
we prove that if $P \equiv Q$ and $P \in \mathbf{Ser}$ then $Q \in \mathbf{Ser}$, and similarly that if $P \equiv Q$
and $Q \in \mathbf{Ser}$ then $P \in \mathbf{Ser}$, by induction on the inference; hence $\mathbf{Ser}$ is the set of
processes that are congruent to programs.

We now turn to the statement of the fact: if $P \in \mathbf{Ser}$ and $P \to^{*} Q$ then $Q \in \mathbf{Ser}$.
We proceed by induction on the length of the derivation. In the induction step,
$P \to^{*} R \to Q$ with $R \in \mathbf{Ser}$, we consider the place where the basic axiom is used
for establishing $R \to Q$. Since a process of the form $T$ cannot contain any labels or
binders, the basic axiom used for $R \to Q$ cannot involve any subprocess of the form
$T$. It follows that the basic axiom must take place inside the (necessarily unique)
occurrence of $n_\star^{l_\star}[\cdots]$ in $R$. Inspection of the five basic axioms then immediately
establishes the result. □


It should be clear that the syntactic annotations in no way influence the semantics.
We can make this precise as follows. Let $\mu_\bullet$ be a distinguished stable name, let $\beta^c_\bullet$
and $\beta^n_\bullet$ be distinguished binders, and let $l^a_\bullet$, $l^i_\bullet$, $l^o_\bullet$, $l^p_\bullet$, $l^c_\bullet$ and $l^n_\bullet$ be distinguished
labels. Given a process $P$ write $\lfloor P \rfloor$ for the process where all stable names, binders
and labels are replaced by the appropriate distinguished stable names, binders and
labels.
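The erasure operation can be sketched for a small communication-free fragment as follows. The tuple encoding of processes and the single placeholder string standing for every distinguished annotation are assumptions made for this illustration.

```python
ERASED = "*"   # hypothetical stand-in for the distinguished stable name and label

def erase(P):
    """Replace every stable name and ambient label in P by the distinguished one."""
    tag = P[0]
    if tag == "nil":
        return P
    if tag == "par":
        return ("par", erase(P[1]), erase(P[2]))
    if tag == "bang":
        return ("bang", erase(P[1]))
    if tag == "new":                         # ("new", name, stable name, body)
        return ("new", P[1], ERASED, erase(P[3]))
    if tag == "amb":                         # ("amb", name, label, body)
        return ("amb", P[1], ERASED, erase(P[3]))
    raise ValueError(f"unsupported construct: {tag}")

# Two annotated versions of the same underlying process.
P1 = ("new", "n", "N", ("amb", "n", "a", ("amb", "m", "b", ("nil",))))
P2 = ("new", "n", "K", ("amb", "n", "c", ("amb", "m", "c", ("nil",))))
```

Processes that differ only in their annotations have the same erasure, which is the premise under which the semantics is compared in the invariance result.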


Fact 2. The semantics is invariant under annotations:

If $P \to^{*} Q$ and $\lfloor P \rfloor = \lfloor P' \rfloor$ then there exists $Q'$ such that $P' \to^{*} Q'$ and
$\lfloor Q \rfloor = \lfloor Q' \rfloor$.


Proof. We first consider the similar statement for the congruence:

If $P \equiv Q$ and $\lfloor P \rfloor = \lfloor P' \rfloor$ then there exists $Q'$ such that $P' \equiv Q'$ and
$\lfloor Q \rfloor = \lfloor Q' \rfloor$; similarly if $P \equiv Q$ and $\lfloor Q \rfloor = \lfloor Q' \rfloor$ then there exists $P'$ such
that $P' \equiv Q'$ and $\lfloor P \rfloor = \lfloor P' \rfloor$.

It is proved by induction on the inference tree for $P \equiv Q$.

We then prove the statement of the fact by induction on the length of the derivation
$P \to^{*} Q$. For the induction step $P \to^{*} R \to Q$ we proceed by induction on the shape
of the inference tree for $R \to Q$. □


3  Control  Flow  Analysis 


Immediate constituents of ambients. The main aim of the control flow analysis
is to obtain the following information for each ambient: (i) which ambients may
be immediately contained in it, (ii) which capabilities it may perform, (iii) which
input actions may be performed immediately inside it, and (iv) which output actions
may be performed immediately inside it. An ambient will be identified by its label
$l^a \in \mathbf{Lab}^a$, an in-capability by its label $l^i \in \mathbf{Lab}^i$, an out-capability by its label
$l^o \in \mathbf{Lab}^o$, an open-capability by its label $l^p \in \mathbf{Lab}^p$, an input of a capability by its
binder³ $\beta^c \in \mathbf{Bnd}^c$, an input of a name by its binder $\beta^n \in \mathbf{Bnd}^n$, the output of a
capability by its label $l^c \in \mathbf{Lab}^c$, and the output of a name by its label $l^n \in \mathbf{Lab}^n$.

The analysis records this information in the following component:

\[ I \in \mathbf{InAmb} = \mathbf{Lab}^a \to \wp(\mathbf{Lab}^a \cup \mathbf{Lab}^i \cup \mathbf{Lab}^o \cup \mathbf{Lab}^p \cup \mathbf{Lab}^c \cup \mathbf{Lab}^n \cup \mathbf{Bnd}^c \cup \mathbf{Bnd}^n) \]

When specifying the analysis we shall also use the “inverse” mapping

\[ I^{-1} : (\mathbf{Lab} \cup \mathbf{Bnd}) \to \wp(\mathbf{Lab}^a) \]

that returns the set of ambients in which the given ambient, capability, input or
output might occur; formally $z \in I(l^a)$ if and only if $l^a \in I^{-1}(z)$. Later we shall
write $z \in I^{+}(l)$ to mean that there exist $l_1, \ldots, l_n$ (for $n > 1$) such that $l = l_1$,
$z = l_n$, and $\forall i < n : l_{i+1} \in I(l_i)$.
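The component $I$, its inverse and the closure $I^{+}$ are ordinary finite set-valued maps, so they can be sketched directly with dictionaries. The function names and the toy estimate below are illustrative assumptions, not data from the paper.

```python
def inverse(I):
    """I^{-1}: map each constituent z to the set of ambient labels that may contain it."""
    inv = {}
    for la, contents in I.items():
        for z in contents:
            inv.setdefault(z, set()).add(la)
    return inv

def closure(I, l):
    """I^+(l): everything reachable from l in one or more containment steps."""
    seen, todo = set(), list(I.get(l, ()))
    while todo:
        z = todo.pop()
        if z not in seen:
            seen.add(z)
            todo.extend(I.get(z, ()))   # only ambient labels have entries in I
    return seen

# Toy estimate: ambient "top" may contain ambient "A"; "A" may contain ambient "B"
# and the in-capability labelled "in1"; "B" may contain the out-capability "out2".
I = {"top": {"A"}, "A": {"B", "in1"}, "B": {"out2"}}
```

For this estimate, `inverse(I)` answers "where may this label occur?", while `closure(I, "top")` collects every label that may end up at any depth inside the top-level ambient.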


3  This  actually  confuses  binders  with  labels;  it  would  be  notationally  purer,  but  somewhat 
heavier,  to  demand  input  actions  to  be  annotated  not  only  with  a  binder  but  also  with 
a  label. 
Stable names of ambients and capabilities. Ambients and capabilities have stable
names associated with them and to keep track of this information the analysis also
contains the following component:

\[ H \in \mathbf{HNam} = (\mathbf{Lab}^a \cup \mathbf{Lab}^i \cup \mathbf{Lab}^o \cup \mathbf{Lab}^p) \to \wp(\mathbf{SNam}) \]

As above we shall also use the “inverse” mapping

\[ H^{-1} : \mathbf{SNam} \to \wp(\mathbf{Lab}^a \cup \mathbf{Lab}^i \cup \mathbf{Lab}^o \cup \mathbf{Lab}^p) \]

that returns the set of ambients that might have the given stable name; formally
$\mu \in H(l)$ if and only if $l \in H^{-1}(\mu)$. The information in $H$ is needed to determine the
ambients operated upon by the capabilities so as to accurately update the contents
of $I$.

Naming environment. The association between free names and variables, and their
stable names and binders, is expressed by a naming environment:

\[ me \in \mathbf{MEnv} = (\mathbf{Nam} \cup \mathbf{Var}^c \cup \mathbf{Var}^n) \to_{\mathrm{fin}} (\mathbf{SNam} \cup \mathbf{Bnd}^c \cup \mathbf{Bnd}^n) \]

such that $me(n) \in \mathbf{SNam}$, $me(x) \in \mathbf{Bnd}^c$ and $me(u) \in \mathbf{Bnd}^n$.

Here we impose the condition that the naming environment “preserves the types”
of names, variables ranging over capabilities and variables ranging over names.

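We shall write $me_\star$ for the initial naming environment for the program $n_\star^{l_\star}[P_\star]$ of
interest and $\mathrm{dom}(me_\star)$ for its finite domain. Recall that for $n_\star^{l_\star}[P_\star]$ to be a program
we demand that $P_\star$ is closed, that $n_\star \notin \mathrm{fn}(P_\star)$ and that $P_\star$ does not contain $l_\star$. For
$(me_\star, n_\star^{l_\star}[P_\star])$ to constitute a program of interest we then demand that:

$n_\star^{l_\star}[P_\star]$ is a program,

$me_\star$ defines all the free names of $n_\star^{l_\star}[P_\star]$, i.e. $\mathrm{fn}(n_\star^{l_\star}[P_\star]) \subseteq \mathrm{dom}(me_\star)$, and
$me_\star$ does not define any variables, i.e. $\mathrm{dom}(me_\star) \subseteq \mathbf{Nam}$, and

the stable name $\mu_\star$ is distinguished, does not occur in $P_\star$, and is only
possessed by $n_\star$, i.e. $me_\star^{-1}(\mu_\star) = \{n_\star\}$.

(Here we write $me_\star^{-1}(\mu)$ for the set $\{n \mid me_\star(n) = \mu\}$ of names mapped to $\mu$.)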

Communication and stable capabilities. The environment $R$ is responsible for
collecting the values that can be bound to the variables as a result of an input action.
The variables are represented by their binders. In the case of input of names it is
natural to represent the name by its stable name. In a similar way, in the case of
input of capabilities, we shall represent the capability by its stable capability:

\[ \tilde{m} \in \mathbf{SCap} \qquad \tilde{m} ::= \mathit{in}^{l^i} \mid \mathit{out}^{l^o} \mid \mathit{open}^{l^p} \]

There are no stable capabilities corresponding to null capabilities ($\epsilon$) and paths
($M_1.M_2$); instead the analysis will break⁴ them into the set of constituent in-, out-,
and open-capabilities.
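Breaking a path into its constituent stable capabilities can be sketched as a recursive traversal. The sketch rests on several assumptions that are not confirmed by the text shown here: a stable capability is modelled as a (kind, label) pair, the stable names of the target naming are returned separately as requirements on $H$, and variables are looked up through their binders in hypothetical dictionaries `Rc` and `Rn` standing for the two parts of $R$.

```python
def decode_naming(naming, me, Rn):
    """Stable names a naming may denote: ("name", n) or ("nvar", u)."""
    kind, ident = naming
    if kind == "name":
        return {me[ident]}
    return set(Rn.get(me[ident], set()))      # values recorded for the binder of u

def translate(cap, me, Rc, Rn):
    """Return (set of stable capabilities, {capability label: required stable names})."""
    tag = cap[0]
    if tag in ("in", "out", "open"):          # (tag, label, naming)
        _, label, naming = cap
        return {(tag, label)}, {label: decode_naming(naming, me, Rn)}
    if tag == "var":                          # ("var", x): whatever x may be bound to
        return set(Rc.get(me[cap[1]], set())), {}
    if tag == "eps":                          # null capability contributes nothing
        return set(), {}
    if tag == "seq":                          # ("seq", M1, M2): union of the parts
        caps1, req1 = translate(cap[1], me, Rc, Rn)
        caps2, req2 = translate(cap[2], me, Rc, Rn)
        labels = req1.keys() | req2.keys()
        req = {l: req1.get(l, set()) | req2.get(l, set()) for l in labels}
        return caps1 | caps2, req
    raise ValueError(f"unknown capability: {tag}")

# The pilot's path: out w, then in k', then in w (labels 1, 2, 3).
path = ("seq", ("out", "1", ("name", "w")),
        ("seq", ("in", "2", ("name", "k'")), ("in", "3", ("name", "w"))))
me = {"w": "w", "k'": "k'"}                   # names mapped to their stable names
stable_caps, required = translate(path, me, Rc={}, Rn={})
```

Because the order of the path is forgotten, the result lives in a finite universe of labelled capabilities, which is the simplification the text motivates.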

The communication box $C$ is responsible for collecting the effects of the output
actions taking place immediately inside the ambient: again the ambient is identified
by its label and the value being communicated will be a stable capability or a stable
name:

\[ R = (R^c, R^n) \in \mathbf{Env} = (\mathbf{Bnd}^c \to \wp(\mathbf{SCap})) \times (\mathbf{Bnd}^n \to \wp(\mathbf{SNam})) \]

\[ C = (C^c, C^n) \in \mathbf{Comm} = (\mathbf{Lab}^a \to \wp(\mathbf{SCap})) \times (\mathbf{Lab}^a \to \wp(\mathbf{SNam})) \]

Also the information in $R$ and $C$ will be needed to accurately update the contents
of $I$ in the presence of communication.

⁴ Clearly a more precise analysis can be devised; however, we choose the simpler approach
so that the constraint solver of Subsection 3.3 only needs to operate in a finite universe
and hence can compute solutions explicitly in polynomial time.

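Example 3. Consider the program $n_\star^{l_\star}[\mathit{Firewall} \mid \mathit{Agent}]$ of Example 1 and the
following analysis estimate (where the initial naming environment $me_\star$ maps the names
$n_\star$, $\mathtt{k}$, $\mathtt{k'}$ and $\mathtt{k''}$ to $\mu_\star$, $k$, $k'$ and $k''$, respectively):

\[
\begin{array}{lcl}
I(l_\star) &=& \{\mathrm{A}, \mathrm{B}, \mathrm{C}\} \\
I(\mathrm{A}) &=& \{1, 2, 3, 4, 5, 6, \mathrm{A}, \mathrm{B}, \mathrm{C}, \mathrm{D}\} \\
I(\mathrm{B}) &=& \{1, 2, 3\} \\
I(\mathrm{C}) &=& \{1, 2, 3, 6, \mathrm{A}, \mathrm{B}, \mathrm{C}, \mathrm{D}\} \\
I(\mathrm{D}) &=& \emptyset
\end{array}
\]

\[
\begin{array}{l|ccccccccccc}
\text{label} & l_\star & \mathrm{A} & \mathrm{B} & \mathrm{C} & \mathrm{D} & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
H & \{\mu_\star\} & \{w\} & \{k\} & \{k'\} & \{k''\} & \{w\} & \{k'\} & \{w\} & \{k'\} & \{k''\} & \{k\}
\end{array}
\]

This shows that the ambient labelled A might perform transitions consuming any of
the capabilities labelled 1-6 and that it might contain any of the ambients labelled
A-D; in particular it might contain the ambient labelled C indicating that the agent
might enter the firewall, and as shown in Example 2 this indeed happens.

No communication is taking place and it is therefore safe to set $R^c(\beta^c) = \emptyset$ and
$R^n(\beta^n) = \emptyset$ for all binders and to set $C^c(l^a) = \emptyset$ and $C^n(l^a) = \emptyset$ for all labels. □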


3.1  The  acceptability  relation 


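The acceptability of a control flow estimate is defined by the following four
predicates (defined in Tables 4 and 5 and explained below):

\[
\begin{array}{ll}
(I,H,C,R) \models^{l}_{me} P
  & \text{for checking a process } P \in \mathbf{Proc}; \\
(I,H,C,R) \rhd_{me} M : \widetilde{M}
  & \text{for translating a capability } M \in \mathbf{Cap} \text{ into a set } \widetilde{M} \in \wp(\mathbf{SCap}) \text{ of stable capabilities}; \\
(I,H,C,R) \blacktriangleright_{me} N : \widetilde{N}
  & \text{for decoding a naming } N \in \mathbf{Nam} \text{ into a set } \widetilde{N} \in \wp(\mathbf{SNam}) \text{ of stable names}; \\
(I,H,C,R) \models^{l} \tilde{m}
  & \text{for checking a stable capability } \tilde{m} \in \mathbf{SCap}.
\end{array}
\]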


Analysis of processes. Table 4 gives a simple syntax-directed definition of what it means for an analysis estimate $(I,H,C,R)$ to be acceptable for the process $P$. The predicate is defined relative to the current naming environment $me$ and the current label $l$ of the enclosing ambient. The naming environment is updated whenever we pass through a restriction operator or an input and the label is updated whenever we pass inside a new ambient. Note that the analysis cannot distinguish between whether a process occurs only once or many times: $!P$ and $P$ are analysed in the same way (as are $P \mid P$ and $P$).

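The clause for ambients $N^{l^a}[P]$ first checks the subprocess $P$ using the appropriate naming environment and label. It then demands that the label of the ambient is recorded as being inside the current label $l$. Finally, it demands that the stable name of the ambient is recorded as being a name of the ambient. Intuitively, $\widetilde{N}$ is the singleton $\{me(n)\}$ when $N$ is $n$; this is made precise by Table 5 explained below.

$(I,H,C,R) \models^{l}_{me} (\nu n^{\mu})P$  iff  $(I,H,C,R) \models^{l}_{me[n \mapsto \mu]} P$

$(I,H,C,R) \models^{l}_{me} 0$  iff  true

$(I,H,C,R) \models^{l}_{me} P \mid P'$  iff  $(I,H,C,R) \models^{l}_{me} P \;\wedge\; (I,H,C,R) \models^{l}_{me} P'$

$(I,H,C,R) \models^{l}_{me} {!P}$  iff  $(I,H,C,R) \models^{l}_{me} P$

$(I,H,C,R) \models^{l}_{me} N^{l^a}[P]$  iff  $(I,H,C,R) \models^{l^a}_{me} P \;\wedge\; l^a \in I(l) \;\wedge\; (I,H,C,R) \Vdash_{me} N : \widetilde{N} \;\wedge\; \widetilde{N} \subseteq H(l^a)$

$(I,H,C,R) \models^{l}_{me} M.P$  iff  $(I,H,C,R) \models^{l}_{me} P \;\wedge\; (I,H,C,R) \rhd_{me} M : \widetilde{M} \;\wedge\; \forall \tilde{m} \in \widetilde{M} : (I,H,C,R) \models^{l} \tilde{m}$

$(I,H,C,R) \models^{l}_{me} \langle M \rangle^{l^c}$  iff  $l^c \in I(l) \;\wedge\; (I,H,C,R) \rhd_{me} M : \widetilde{M} \;\wedge\; \forall l^a \in I^{-1}(l^c) : C^c(l^a) \supseteq \widetilde{M}$

$(I,H,C,R) \models^{l}_{me} \langle\langle N \rangle\rangle^{l^n}$  iff  $l^n \in I(l) \;\wedge\; (I,H,C,R) \Vdash_{me} N : \widetilde{N} \;\wedge\; \forall l^a \in I^{-1}(l^n) : C^n(l^a) \supseteq \widetilde{N}$

$(I,H,C,R) \models^{l}_{me} (x^{\beta^c}).P$  iff  $(I,H,C,R) \models^{l}_{me[x \mapsto \beta^c]} P \;\wedge\; \beta^c \in I(l) \;\wedge\; \forall l^a \in I^{-1}(\beta^c) : C^c(l^a) \subseteq R^c(\beta^c)$

$(I,H,C,R) \models^{l}_{me} ((u^{\beta^n})).P$  iff  $(I,H,C,R) \models^{l}_{me[u \mapsto \beta^n]} P \;\wedge\; \beta^n \in I(l) \;\wedge\; \forall l^a \in I^{-1}(\beta^n) : C^n(l^a) \subseteq R^n(\beta^n)$

Table 4. Control flow analysis (for processes).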

As in Prolog, all free identifiers (like $\widetilde{N}$ and $\widetilde{M}$) on the right-hand sides of clauses are implicitly assumed to be existentially quantified; this means that whenever a clause is applied we are free to supply suitable values for these identifiers.

The clause for movement $M.P$ first checks the subprocess $P$ using the appropriate naming environment and label. It then translates the capability $M$ into the set of stable capabilities $\widetilde{M}$. Intuitively $\widetilde{M}$ is the singleton $\{\mathsf{in}^{l^i}\}$ when $M$ is $\mathsf{in}^{l^i}\, n$ and similarly for the other capabilities; this is made precise by Table 5 explained below (that also takes care of associating the stable name of $n$ with the label $l^i$).

The two clauses for output actions first record that the action may take place inside the enclosing ambient. The next step is to translate the capability $M$ (resp. the name $N$) into a set of stable capabilities $\widetilde{M}$ (resp. a set of stable names $\widetilde{N}$) thereby making it independent of the context. The communication box $C$ is then updated to record that each ambient $l^a$ (including $l$) that could contain the output action will in fact witness the output of $\widetilde{M}$ (resp. $\widetilde{N}$).

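Finally, the two clauses for input action first update the environment $me$ to record the binding of the variable before analysing the subprocess. It is then ensured that the component $I$ contains the appropriate binder ($\beta^c$ or $\beta^n$) representing the input action. To determine the possible value being communicated (and hence bound to the variable represented by the binder) we have to consult the communication box of the enclosing ambient. So for all ambients $l^a$ (including $l$) that might contain the input action we record that the contents of the communication box $C(l^a)$ can be a value of the binder and we use the environment $R$ to capture this.

$(I,H,C,R) \rhd_{me} \mathsf{in}^{l^i} N : \widetilde{M}$  iff  $(I,H,C,R) \Vdash_{me} N : \widetilde{N} \;\wedge\; \widetilde{M} \supseteq \{\mathsf{in}^{l^i}\} \;\wedge\; H(l^i) \supseteq \widetilde{N}$

$(I,H,C,R) \rhd_{me} \mathsf{out}^{l^o} N : \widetilde{M}$  iff  $(I,H,C,R) \Vdash_{me} N : \widetilde{N} \;\wedge\; \widetilde{M} \supseteq \{\mathsf{out}^{l^o}\} \;\wedge\; H(l^o) \supseteq \widetilde{N}$

$(I,H,C,R) \rhd_{me} \mathsf{open}^{l^p} N : \widetilde{M}$  iff  $(I,H,C,R) \Vdash_{me} N : \widetilde{N} \;\wedge\; \widetilde{M} \supseteq \{\mathsf{open}^{l^p}\} \;\wedge\; H(l^p) \supseteq \widetilde{N}$

$(I,H,C,R) \rhd_{me} x : \widetilde{M}$  iff  $\widetilde{M} \supseteq R^c(me(x))$

$(I,H,C,R) \rhd_{me} \epsilon : \widetilde{M}$  iff  $\widetilde{M} \supseteq \emptyset$

$(I,H,C,R) \rhd_{me} M_1.M_2 : \widetilde{M}$  iff  $(I,H,C,R) \rhd_{me} M_1 : \widetilde{M}_1 \;\wedge\; (I,H,C,R) \rhd_{me} M_2 : \widetilde{M}_2 \;\wedge\; \widetilde{M} \supseteq \widetilde{M}_1 \cup \widetilde{M}_2$

$(I,H,C,R) \Vdash_{me} n : \widetilde{N}$  iff  $\widetilde{N} \supseteq \{me(n)\}$

$(I,H,C,R) \Vdash_{me} u : \widetilde{N}$  iff  $\widetilde{N} \supseteq R^n(me(u))$

$(I,H,C,R) \models^{l} \mathsf{in}^{l^i}$  iff  $l^i \in I(l) \;\wedge\; \forall l^a \in I^{-1}(l^i) : \forall \mu \in H(l^i) : \forall l^{a'} \in I^{-1}(l^a) : \forall l^{a''} \in I(l^{a'}) \cap H^{-1}(\mu) \cap \mathbf{Lab}^a : l^a \in I(l^{a''})$

$(I,H,C,R) \models^{l} \mathsf{out}^{l^o}$  iff  $l^o \in I(l) \;\wedge\; \forall l^a \in I^{-1}(l^o) : \forall \mu \in H(l^o) : \forall l^{a'} \in I^{-1}(l^a) \cap H^{-1}(\mu) : \forall l^{a''} \in I^{-1}(l^{a'}) : l^a \in I(l^{a''})$

$(I,H,C,R) \models^{l} \mathsf{open}^{l^p}$  iff  $l^p \in I(l) \;\wedge\; \forall l^a \in I^{-1}(l^p) : \forall \mu \in H(l^p) : \forall l^{a'} \in I(l^a) \cap H^{-1}(\mu) \cap \mathbf{Lab}^a : \forall l' \in I(l^{a'}) : l' \in I(l^a)$

Table 5. Control flow analysis (for capabilities and namings).

Fig. 2. Pictorial representation of the analysis of atomic capabilities.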

Translation  of  capabilities  and  namings.  The  first  two  parts  of  Table  5  translate 
capabilities  and  namings  into  a  form  where  they  are  independent  of  the  context. 
For capabilities we return (in $\widetilde{M}$) the stable capability and in the $H$ component the
stable  names  relating  to  the  capability.  It  would  have  been  very  natural  to  follow 
[18]  in  bypassing  the  H  component  for  this  purpose  and  to  include  the  stable  name 
in  the  stable  capability;  the  reason  for  not  doing  so  is  to  allow  approximations  that 
will  yield  faster  analyses  than  possible  using  the  approach  of  [18]. 

The entities recorded in $R$ come into play in the clause for translating variables to stable capabilities and stable namings; as an example, for a capability variable $x$ we consult $me$ and $R^c$ to determine the possible stable capabilities that $x$ might stand for. The two clauses for null capabilities and paths show how capabilities are broken up into their atomic constituents; as mentioned earlier this is to facilitate the development of a simple constraint solver for implementing the analysis in polynomial time.
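As an illustration of how a capability is broken up into its atomic constituents, the following minimal sketch computes the least $\widetilde{M}$ permitted by the translation clauses of Table 5, and collects the side conditions of the form $H(l) \supseteq \widetilde{N}$ as obligations rather than checking them. The tuple encoding of capabilities and namings and the function names are assumptions of the sketch.

```python
def translate_naming(N, me, Rn):
    """Least set of stable names for a naming N (a name or a name variable)."""
    kind, ident = N
    if kind == "name":                      # n  gives  {me(n)}
        return {me[ident]}
    if kind == "var":                       # u  gives  R^n(me(u))
        return set(Rn.get(me[ident], set()))
    raise ValueError(kind)

def translate_capability(M, me, Rc, Rn):
    """Return (least set of stable capabilities, obligations on H).

    Each obligation (l, names) stands for the side condition H(l) >= names.
    """
    kind = M[0]
    if kind in ("in", "out", "open"):       # atomic capability with label
        _, label, naming = M
        return {(kind, label)}, [(label, translate_naming(naming, me, Rn))]
    if kind == "var":                       # capability variable x
        return set(Rc.get(me[M[1]], set())), []
    if kind == "eps":                       # null capability
        return set(), []
    if kind == "path":                      # M1.M2: union of both parts
        caps1, obl1 = translate_capability(M[1], me, Rc, Rn)
        caps2, obl2 = translate_capability(M[2], me, Rc, Rn)
        return caps1 | caps2, obl1 + obl2
    raise ValueError(kind)

# The path out^1 w . in^2 k' . in^3 w used by the firewall example.
me = {"w": "w", "k'": "k'"}
path = ("path", ("out", 1, ("name", "w")),
        ("path", ("in", 2, ("name", "k'")), ("in", 3, ("name", "w"))))
caps, obligations = translate_capability(path, me, Rc={}, Rn={})
print(caps, obligations)
```

Any superset of the returned set is also an admissible $\widetilde{M}$, since the clauses only demand inclusions.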

The form of the judgements for translating capabilities and namings combines the verbose and succinct forms of flow logic [16]. The verbose format, as used for the analysis of processes and capabilities, explicitly contains a record of the information as it pertains to all internal program points; this is part of the $(I,H,C,R)$ component. The succinct format, to be specific the $\widetilde{M}$ and $\widetilde{N}$ components of the judgements for translation, directly expresses auxiliary information that is only of local interest. The use of succinct components frequently makes specifications more readable and tends to give them the flavour of type systems. The relationship between verbose and succinct specifications is studied in [21].


Analysis of capabilities. The last part of Table 5 shows how to check stable capabilities against the control flow estimate $(I,H,C,R)$. Figure 2 illustrates these clauses pictorially; the similarity between Figures 2 and 1 stresses the systematic way in which a control flow analysis may be developed from a formal semantics and we regard this as a strong point of our approach.

The clause for $\mathsf{in}^{l^i}$ first ensures that the stable capability is properly recorded as part of the current ambient $l$. Then it ensures that all contexts $l^a$ in which the capability could occur (and this clearly includes $l$) are properly recorded as being possible subambients of all sibling ambients $l^{a''}$ having a stable name $\mu$ associated with $\mathsf{in}^{l^i}$. This involves quantifying over all possible parent ambients $l^{a'}$ and using the component $H$ to obtain the stable name of the ambient that $l^i$ indicates.

The clause for $\mathsf{out}^{l^o}$ follows a similar pattern. First it ensures that the stable capability is recorded as part of the current ambient $l$. Next it ensures that all contexts $l^a$ in which the capability could occur (and again this includes $l$) are properly recorded as being possible ambients in all the possible grandparents $l^{a''}$ provided that the parent $l^{a'}$ has a stable name $\mu$ associated with $\mathsf{out}^{l^o}$.

For the stable capability $\mathsf{open}^{l^p}$ we once again start by ensuring that it is properly recorded as part of the current ambient $l$. Then we consider all contexts $l^a$ in which the capability could occur (and again this includes $l$) and find all subambients $l^{a'}$ having a stable name $\mu$ associated with $\mathsf{open}^{l^p}$; these are opened by ensuring that whatever is included in the subambient $l^{a'}$ also occurs in the parent ambient $l^a$.

It is crucial to observe that we need to consult all possible contexts $l^a$ in which the capability could occur and not just the obvious candidate $l$. This is because, in order to establish semantic soundness, the analysis has to take into account that the current ambient might be dissolved by an open-capability.


Example 4. Let us check the condition

$(I,H,C,R) \models^{A}_{me} k^{B}[\mathsf{out}^{1} w.\, \mathsf{in}^{2} k'.\, \mathsf{in}^{3} w]$

that arises when checking that the analysis estimate $(I,H,C,R)$ of Example 3 correctly validates the program $n_\star^{l_\star}[\mathit{Firewall} \mid \mathit{Agent}]$ of Example 1; here the naming environment $me$ maps $n_\star$, $k$, $k'$, $k''$ and $w$ to $\mu_\star$, $k$, $k'$, $k''$ and $w$, respectively. First we decide to let $\widetilde{N}$ be $\{k\}$. We then need to check that $(I,H,C,R) \Vdash_{me} k : \{k\}$ (which follows from the choice of $me$), that $\{k\} \subseteq H(B)$ (which follows from Example 3), that $B \in I(A)$ (which once more follows from Example 3) and that $(I,H,C,R) \models^{B}_{me} \mathsf{out}^{1} w.\, \mathsf{in}^{2} k'.\, \mathsf{in}^{3} w$ (see below).

To check that $(I,H,C,R) \models^{B}_{me} \mathsf{out}^{1} w.\, \mathsf{in}^{2} k'.\, \mathsf{in}^{3} w$ we first decide to let $\widetilde{M}$ be $\{\mathsf{out}^{1}\}$. We then need to check that $(I,H,C,R) \rhd_{me} \mathsf{out}^{1} w : \{\mathsf{out}^{1}\}$ (which follows from $(I,H,C,R) \Vdash_{me} w : \{w\}$ and $H(1) \supseteq \{w\}$), that $(I,H,C,R) \models^{B} \mathsf{out}^{1}$ (see below) and that $(I,H,C,R) \models^{B}_{me} \mathsf{in}^{2} k'.\, \mathsf{in}^{3} w$ (which amounts to twice repeating the checking procedure being illustrated for $\mathsf{out}^{1} w$).

Finally, let us check that $(I,H,C,R) \models^{B} \mathsf{out}^{1}$. First we check that $1 \in I(B)$ (using Example 3). For the second condition we have $l^a \in I^{-1}(1) = \{A, B, C\}$ and $\mu \in H(1) = \{w\}$; for each of the choices for $l^a$ we have $l^{a'} \in I^{-1}(l^a) \cap H^{-1}(w) = \{l_\star, A, C\} \cap \{A, 1, 3\} = \{A\}$ so the parent ambient $l^{a'}$ of $l^a$ will always be $A$. The grandparent of $l^a$ is $l^{a''} \in I^{-1}(A) = \{l_\star, A, C\}$ so the second condition amounts to checking that all of $A$, $B$ and $C$ are elements of all of $I(l_\star)$, $I(A)$ and $I(C)$ and clearly this is the case. □
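The check carried out by hand in Example 4 can be mechanised directly from the three clauses for stable capabilities. The following is a minimal sketch that hard-codes the estimate of Example 3; the dictionary encoding, the constant `AMBIENTS` (standing for the ambient labels) and the function names are assumptions of the sketch.

```python
from collections import defaultdict

AMBIENTS = {"l*", "A", "B", "C", "D"}      # the ambient labels Lab^a
I = {
    "l*": {"A", "B", "C"},
    "A": {1, 2, 3, 4, 5, 6, "A", "B", "C", "D"},
    "B": {1, 2, 3},
    "C": {1, 2, 3, 6, "A", "B", "C", "D"},
    "D": set(),
}
H = {
    "l*": {"mu*"}, "A": {"w"}, "B": {"k"}, "C": {"k'"}, "D": {"k''"},
    1: {"w"}, 2: {"k'"}, 3: {"w"}, 4: {"k'"}, 5: {"k''"}, 6: {"k"},
}

def inverse(rel):
    """Invert a set-valued map; missing keys yield the empty set."""
    inv = defaultdict(set)
    for x, ys in rel.items():
        for y in ys:
            inv[y].add(x)
    return inv

def check_in(I, H, l, li):
    """Clause for in^li: contexts must be recorded inside matching siblings."""
    Iinv, Hinv = inverse(I), inverse(H)
    return li in I[l] and all(
        la in I[la2]
        for la in Iinv[li]
        for mu in H[li]
        for la1 in Iinv[la]                          # possible parents
        for la2 in I[la1] & Hinv[mu] & AMBIENTS      # siblings named mu
    )

def check_out(I, H, l, lo):
    """Clause for out^lo: contexts must be recorded inside the grandparents."""
    Iinv, Hinv = inverse(I), inverse(H)
    return lo in I[l] and all(
        la in I[la2]
        for la in Iinv[lo]
        for mu in H[lo]
        for la1 in Iinv[la] & Hinv[mu]               # parents named mu
        for la2 in Iinv[la1]                         # grandparents
    )

def check_open(I, H, l, lp):
    """Clause for open^lp: contents of opened subambients move to the parent."""
    Iinv, Hinv = inverse(I), inverse(H)
    return lp in I[l] and all(
        I[la1] <= I[la]
        for la in Iinv[lp]
        for mu in H[lp]
        for la1 in I[la] & Hinv[mu] & AMBIENTS       # subambients named mu
    )

print(check_out(I, H, "B", 1))    # the condition checked in Example 4
```

Removing `"C"` from `I["A"]` makes `check_in(I, H, "B", 3)` fail, which reflects that the estimate must record the agent entering the firewall.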


3.2  Properties  of  the  analysis 

In  the  terminology  of  data  flow  analysis  [17]  the  above  analysis  is  flow-insensitive 
since  we  ignore  the  order  in  which  the  capabilities  occur;  also  it  is  context-insensitive 
(or  monovariant)  since  a  capability  is  analysed  in  the  same  way  for  all  contexts 
in  which  it  occurs.  We  refer  to  [14,19,22]  for  more  precise  ways  of  analysing  the 
communication-free  fragment. 


Semantic correctness. Having specified what it means for an analysis estimate $(I,H,C,R)$ to be acceptable the next step is to show that the notion of acceptability is semantically meaningful. We begin by establishing some auxiliary properties.

Fact 3. The analysis enjoys the following monotonicity properties:

(i) If $(I,H,C,R) \models^{l_1}_{me} P$ and $I(l_1) \subseteq I(l_2)$ then $(I,H,C,R) \models^{l_2}_{me} P$.

(ii) If $(I,H,C,R) \rhd_{me} M : \widetilde{M}_1$ and $\widetilde{M}_1 \subseteq \widetilde{M}_2$ then $(I,H,C,R) \rhd_{me} M : \widetilde{M}_2$.

(iii) If $(I,H,C,R) \Vdash_{me} N : \widetilde{N}_1$ and $\widetilde{N}_1 \subseteq \widetilde{N}_2$ then $(I,H,C,R) \Vdash_{me} N : \widetilde{N}_2$.

(iv) If $(I,H,C,R) \models^{l_1} \tilde{m}$ and $I(l_1) \subseteq I(l_2)$ then $(I,H,C,R) \models^{l_2} \tilde{m}$.

Proof. It is immediate to prove (iii) by inspection of the two cases for $N$.

We then prove (ii) by inspection of the six cases for $M$ and all but one case is immediate. In the case for $M_1.M_2$ we have $(I,H,C,R) \rhd_{me} M_1 : \widetilde{M}'_1$ and $(I,H,C,R) \rhd_{me} M_2 : \widetilde{M}'_2$ for some $\widetilde{M}'_1$ and $\widetilde{M}'_2$ such that $\widetilde{M}_1 \supseteq \widetilde{M}'_1 \cup \widetilde{M}'_2$; it is then immediate that $\widetilde{M}_2 \supseteq \widetilde{M}'_1 \cup \widetilde{M}'_2$, as was to be shown.

Next we prove (iv) by inspection of the three cases for $\tilde{m}$. All cases are immediate since the label $l_i$ (for $i = 1, 2$) is only used to establish a fact of the form $l' \in I(l_i)$.

Finally we prove (i) by structural induction in $P$. This is straightforward because the label $l_i$ (for $i = 1, 2$) is only used in recursive calls, to establish a fact of the form $l' \in I(l_i)$ or $\beta' \in I(l_i)$, or to invoke (iv). □

To express the next fact we shall write $me_1 =_P me_2$ to mean that $me_1$ and $me_2$ are equal on the free names and variables of $P$; in a similar way we write $me_1 =_M me_2$ and $me_1 =_N me_2$ for capabilities $M$ and namings $N$.

Fact 4. The analysis only depends on the free names and variables:

(i) If $me_1 =_P me_2$ and $(I,H,C,R) \models^{l}_{me_1} P$ then $(I,H,C,R) \models^{l}_{me_2} P$.

(ii) If $me_1 =_M me_2$ and $(I,H,C,R) \rhd_{me_1} M : \widetilde{M}$ then $(I,H,C,R) \rhd_{me_2} M : \widetilde{M}$.

(iii) If $me_1 =_N me_2$ and $(I,H,C,R) \Vdash_{me_1} N : \widetilde{N}$ then $(I,H,C,R) \Vdash_{me_2} N : \widetilde{N}$.

Proof. The proof of (iii) is immediate by inspection of the two cases for $N$. Next the proof of (ii) is straightforward by structural induction in $M$ and using (iii). Finally the proof of (i) is by a straightforward structural induction using (ii) and (iii); in particular, the induction hypothesis still applies in the cases of restriction, input of capabilities and input of names, where the naming environment is updated. □

Lemma 1. The analysis is invariant under the congruence:

If $P \equiv Q$ then $(I,H,C,R) \models^{l}_{me} P$ if and only if $(I,H,C,R) \models^{l}_{me} Q$.

Proof. The proof is by induction on the proof of $P \equiv Q$ and relies on Fact 4. Most cases are straightforward and we just illustrate one of the more interesting ones.

Consider the case $(\nu n^{\mu})(P \mid Q) \equiv P \mid (\nu n^{\mu})Q$ where $n \notin fn(P)$. It is immediate from the clauses of Table 4 that

$(I,H,C,R) \models^{l}_{me} (\nu n^{\mu})(P \mid Q)$

is equivalent to $(I,H,C,R) \models^{l}_{me[n \mapsto \mu]} P \mid Q$ and hence to

$(I,H,C,R) \models^{l}_{me[n \mapsto \mu]} P \;\wedge\; (I,H,C,R) \models^{l}_{me[n \mapsto \mu]} Q.$

Since $n \notin fn(P)$ it follows from Fact 4 that this is equivalent to

$(I,H,C,R) \models^{l}_{me} P \;\wedge\; (I,H,C,R) \models^{l}_{me[n \mapsto \mu]} Q$

and using the clauses of Table 4 this is equivalent to

$(I,H,C,R) \models^{l}_{me} P \mid (\nu n^{\mu})Q$

as desired. □

In order to establish the Subject Reduction Result we shall additionally need the following central result showing how substitutions work:

Lemma 2. The analysis enjoys the following substitution properties:

(i) If $(I,H,C,R) \rhd_{me} M : R^c(\beta^c)$ and $(I,H,C,R) \models^{l}_{me[x \mapsto \beta^c]} P$ then $(I,H,C,R) \models^{l}_{me} P[x \leftarrow M]$.

(ii) If $(I,H,C,R) \Vdash_{me} N : R^n(\beta^n)$ and $(I,H,C,R) \models^{l}_{me[u \mapsto \beta^n]} P$ then $(I,H,C,R) \models^{l}_{me} P[u \leftarrow N]$.

Proof. As a preparation for proving (ii) we first prove

if $(I,H,C,R) \Vdash_{me} N : R^n(\beta^n)$ and $(I,H,C,R) \Vdash_{me[u \mapsto \beta^n]} N' : \widetilde{N}'$ then $(I,H,C,R) \Vdash_{me} N'[u \leftarrow N] : \widetilde{N}'$

by inspection of the two cases for $N'$. Next we prove

if $(I,H,C,R) \Vdash_{me} N : R^n(\beta^n)$ and $(I,H,C,R) \rhd_{me[u \mapsto \beta^n]} M : \widetilde{M}$ then $(I,H,C,R) \rhd_{me} M[u \leftarrow N] : \widetilde{M}$

by structural induction in $M$ and using the result just established; the proof is straightforward. We are then ready to prove (ii) by structural induction in $P$ and using the two results just established; also this proof is straightforward.

As a preparation for proving (i) we first prove

if $(I,H,C,R) \rhd_{me} M : R^c(\beta^c)$ and $(I,H,C,R) \rhd_{me[x \mapsto \beta^c]} M' : \widetilde{M}'$ then $(I,H,C,R) \rhd_{me} M'[x \leftarrow M] : \widetilde{M}'$

by structural induction in $M'$; the proof is straightforward. We are then ready to prove (i) by structural induction in $P$ and using the result just established; also this proof is straightforward. □

Theorem 1. Subject Reduction:

If $(I,H,C,R) \models^{l}_{me} P$ and $P \to^{*} Q$ then $(I,H,C,R) \models^{l}_{me} Q$.

Proof. The proof is by induction in the length of the derivation. In the induction step, $P \to^{*} S \to Q$, we have $(I,H,C,R) \models^{l}_{me} S$ from the induction hypothesis and need to show $(I,H,C,R) \models^{l}_{me} Q$.

We proceed by induction in the transition $S \to Q$. We first consider the reduction rules in the left-hand side of Table 3. The proofs for the first three are straightforward using the induction hypothesis and the specification of Table 4; the proof for the fourth reduction rule, the one involving the congruence, is a direct consequence of the induction hypothesis and Lemma 1. We next turn our attention to the basic axioms in the right-hand side of Table 3. The proofs for the in- and out-capabilities are straightforward using the specifications of Tables 4 and 5. In the case of the open-capability we additionally make use of Fact 3. Finally, in the case of communication we make use of Lemma 2. This concludes the proof. □

As a consequence, if $(I,H,C,R)$ is an acceptable analysis estimate for the program $(me_\star, n_\star^{l_\star}[P_\star])$ of interest then it will continue being so for all the derivatives of the program.


Existence of analysis estimates. So far we have only shown how to check that a given estimate $(I,H,C,R)$ is indeed an acceptable analysis estimate; we have not studied (i) whether or not acceptable analysis estimates always exist, and if they do, (ii) whether or not there always is a least analysis estimate.

To obtain these results we shall show that the set of acceptable analysis estimates constitutes a Moore family (sometimes called a model intersection property):

A subset $Y$ of a complete lattice $(L, \sqsubseteq)$ is a Moore family whenever $Y' \subseteq Y$ implies that $\sqcap Y' \in Y$.

By taking $Y' = \emptyset$ we see that a Moore family $Y$ cannot be empty and by taking $Y' = Y$ we see that it always contains a least element; this will be essential for answering (i) and (ii) in the affirmative.
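The definition can be illustrated on a toy powerset lattice. The following minimal sketch brute-forces the Moore family condition for a small family of sets; the universe `{1, 2, 3}` and the predicate `acceptable` are purely illustrative assumptions and are not the acceptability relation of Tables 4 and 5.

```python
from itertools import combinations

UNIVERSE = frozenset({1, 2, 3})

def acceptable(s):
    """Toy condition: 1 must be present, and 1 in s implies 2 in s."""
    return 1 in s and (1 not in s or 2 in s)

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

def meet(family):
    """Greatest lower bound in the powerset lattice.

    The empty family yields the top element, here the whole universe.
    """
    result = UNIVERSE
    for s in family:
        result = result & s
    return result

def is_moore_family(Y):
    """Every subfamily of Y (including the empty one) has its meet in Y."""
    return all(meet(sub) in Y
               for r in range(len(Y) + 1)
               for sub in combinations(Y, r))

Y = [s for s in powerset(UNIVERSE) if acceptable(s)]
print(is_moore_family(Y), sorted(meet(Y)))
```

Here `meet(Y)` is the least acceptable set, obtained as the intersection of all acceptable sets, which mirrors how the least analysis estimate arises.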

In our setting the complete lattice of interest is the set

$\mathbf{InAmb} \times \mathbf{HNam} \times \mathbf{Comm} \times \mathbf{Env}$

of tuples of mappings $(I,H,C,R)$ and the ordering is the pointwise extension of the subset ordering. It follows that greatest lower bounds are calculated in a pointwise manner:

$\sqcap_{j \in J}(I_j, H_j, C_j, R_j) = (\sqcap_{j \in J} I_j,\; \sqcap_{j \in J} H_j,\; \sqcap_{j \in J} C_j,\; \sqcap_{j \in J} R_j)$

Since $I \sqsubseteq I'$ holds if and only if $I^{-1} \sqsubseteq I'^{-1}$ we also have $(\sqcap_{j \in J} I_j)^{-1} = \sqcap_{j \in J}(I_j^{-1})$. Finally, recall that $\sqcap_{j \in J} \cdots$ produces the greatest element $\top$ when $J$ is the empty set.
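The pointwise calculation of greatest lower bounds, and the fact that it commutes with taking inverses, can be sketched as follows for non-empty families. The dictionary representation and the helper names are assumptions of the sketch; the empty family is deliberately not handled, since its meet is the greatest element and that requires fixing the finite universe.

```python
from collections import defaultdict

def meet(maps):
    """Pointwise intersection of a non-empty list of set-valued maps."""
    keys = set().union(*(m.keys() for m in maps))
    return {k: set.intersection(*(set(m.get(k, set())) for m in maps))
            for k in keys}

def meet_estimates(estimates):
    """Componentwise meet of a non-empty list of tuples (I, H, C, R)."""
    return tuple(meet(list(component)) for component in zip(*estimates))

def inverse(rel):
    inv = defaultdict(set)
    for x, ys in rel.items():
        for y in ys:
            inv[y].add(x)
    return dict(inv)

def nonempty(m):
    """Drop entries mapped to the empty set, so that maps can be compared."""
    return {k: v for k, v in m.items() if v}

I1 = {"A": {"B", "C"}, "B": {1}}
I2 = {"A": {"B"}, "B": {1, 2}}
lhs = nonempty(inverse(meet([I1, I2])))
rhs = nonempty(meet([inverse(I1), inverse(I2)]))
print(lhs == rhs)
```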

We can now prove that the set of acceptable control flow estimates constitutes a Moore family and hence that there always is a least estimate:

Theorem 2. The set $\{(I,H,C,R) \mid (I,H,C,R) \models^{l}_{me} P\}$ is a Moore family for all $l$, $me$ and $P$.

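Proof. The proof is in four parts. The first part amounts to proving that

$\{(I,H,C,R,\widetilde{N}) \mid (I,H,C,R) \Vdash_{me} N : \widetilde{N}\}$

is a Moore family for all $me$ and $N$. This is immediate by inspection of the two cases for $N$.

The second part establishes that

$\{(I,H,C,R) \mid (I,H,C,R) \models^{l} \tilde{m}\}$

is a Moore family for all $l$ and $\tilde{m}$. We proceed by inspection of the three cases for $\tilde{m}$. In the case of in-capabilities we assume that

$\forall j \in J : (I_j,H_j,C_j,R_j) \models^{l} \mathsf{in}^{l^i}$    (1)

and note that then $l^i \in I_j(l)$ for all $j \in J$ and hence $l^i \in (\sqcap_{j \in J} I_j)(l)$. Next consider $l^a$, $\mu$, $l^{a'}$ and $l^{a''}$ such that $l^a \in (\sqcap_{j \in J} I_j)^{-1}(l^i)$, $\mu \in (\sqcap_{j \in J} H_j)(l^i)$, $l^{a'} \in (\sqcap_{j \in J} I_j)^{-1}(l^a)$ and $l^{a''} \in (\sqcap_{j \in J} I_j)(l^{a'}) \cap (\sqcap_{j \in J} H_j)^{-1}(\mu) \cap \mathbf{Lab}^a$. It is immediate that we then have $l^a \in I_j^{-1}(l^i)$, $\mu \in H_j(l^i)$, $l^{a'} \in I_j^{-1}(l^a)$ and $l^{a''} \in I_j(l^{a'}) \cap H_j^{-1}(\mu) \cap \mathbf{Lab}^a$ for all $j \in J$; it then follows from (1) that $l^a \in I_j(l^{a''})$ for all $j \in J$ and hence that $l^a \in (\sqcap_{j \in J} I_j)(l^{a''})$. In conclusion we then have

$(\sqcap_{j \in J}(I_j,H_j,C_j,R_j)) \models^{l} \mathsf{in}^{l^i}$

as desired. The cases of out- and open-capabilities are similar.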

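The third part shows that

$\{(I,H,C,R,\widetilde{M}) \mid (I,H,C,R) \rhd_{me} M : \widetilde{M}\}$

is a Moore family for all $M$ and $me$. The proof is by structural induction in $M$ and let us consider the case $M_1.M_2$. Here we assume that

$\forall j \in J : (I_j,H_j,C_j,R_j) \rhd_{me} M_1.M_2 : \widetilde{M}_j$

Hence there exist families $(\widetilde{M}_{j1})_{j \in J}$ and $(\widetilde{M}_{j2})_{j \in J}$ such that

$\forall j \in J : (I_j,H_j,C_j,R_j) \rhd_{me} M_1 : \widetilde{M}_{j1}$
$\forall j \in J : (I_j,H_j,C_j,R_j) \rhd_{me} M_2 : \widetilde{M}_{j2}$
$\forall j \in J : \widetilde{M}_j \supseteq \widetilde{M}_{j1} \cup \widetilde{M}_{j2}$

By the induction hypothesis it follows that

$(\sqcap_{j \in J}(I_j,H_j,C_j,R_j)) \rhd_{me} M_1 : \sqcap_{j \in J} \widetilde{M}_{j1}$
$(\sqcap_{j \in J}(I_j,H_j,C_j,R_j)) \rhd_{me} M_2 : \sqcap_{j \in J} \widetilde{M}_{j2}$
$(\sqcap_{j \in J} \widetilde{M}_j) \supseteq (\sqcap_{j \in J} \widetilde{M}_{j1}) \cup (\sqcap_{j \in J} \widetilde{M}_{j2})$

so that

$(\sqcap_{j \in J}(I_j,H_j,C_j,R_j)) \rhd_{me} M_1.M_2 : \sqcap_{j \in J} \widetilde{M}_j$

as desired. The remaining cases are similar.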

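Finally, the fourth part of the proof establishes the statement of the theorem. It is proved by structural induction in $P$ and involves no new methods of reasoning beyond those already illustrated; we therefore only illustrate the case $N^{l^a}[P]$ of ambients. Here we assume that

$\forall j \in J : (I_j,H_j,C_j,R_j) \models^{l}_{me} N^{l^a}[P]$

It follows that there exists a family $(\widetilde{N}_j)_{j \in J}$ such that

$\forall j \in J : (I_j,H_j,C_j,R_j) \models^{l^a}_{me} P$
$\forall j \in J : l^a \in I_j(l)$
$\forall j \in J : (I_j,H_j,C_j,R_j) \Vdash_{me} N : \widetilde{N}_j$
$\forall j \in J : \widetilde{N}_j \subseteq H_j(l^a)$

Using the induction hypothesis and the results established above we then have

$(\sqcap_{j \in J}(I_j,H_j,C_j,R_j)) \models^{l^a}_{me} P$
$l^a \in (\sqcap_{j \in J} I_j)(l)$
$(\sqcap_{j \in J}(I_j,H_j,C_j,R_j)) \Vdash_{me} N : \sqcap_{j \in J} \widetilde{N}_j$
$\sqcap_{j \in J} \widetilde{N}_j \subseteq (\sqcap_{j \in J} H_j)(l^a)$

and the desired

$(\sqcap_{j \in J}(I_j,H_j,C_j,R_j)) \models^{l}_{me} N^{l^a}[P]$

then follows. This concludes the proof. □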


3.3  Algorithmic  properties 

We now show how to compute in polynomial time the least solutions guaranteed by Theorem 2. We shall proceed in three stages. In the first we generate a set of master constraints. In the second we expand the master constraints to a larger set of conditional constraints. Finally, we solve the conditional constraints in polynomial time. As mentioned earlier we may without loss of generality assume that $\mathbf{Lab} \cup \mathbf{Bnd} \cup \mathbf{SNam}$ is finite.
Master constraints. To generate the master constraints we use four functions. The auxiliary functions $\mathcal{VM}$ and $\mathcal{VN}$ of Table 7 extract the "succinct results" produced by the analysis of capabilities and namings; these are the analogues of the sets $\widetilde{M}$ and $\widetilde{N}$ and here take the forms $\{m_1, \cdots, m_k\}$ and $\{n\}$, where

$m ::= \{|\mathsf{in}^{l^i}|\} \mid \{|\mathsf{out}^{l^o}|\} \mid \{|\mathsf{open}^{l^p}|\} \mid \mathsf{R}^c(\beta^c)$

$n ::= \{|\mu|\} \mid \mathsf{R}^n(\beta^n)$

and $\beta^c \in \mathbf{Bnd}^c$, $\mu \in \mathbf{SNam}$ and $\beta^n \in \mathbf{Bnd}^n$. To make it clear that we are now talking about syntax we write $\{|\cdots|\}$ instead of $\{\cdots\}$ and $\mathsf{R}^n$ instead of $R^n$ etc. Intuitively, the minimal $\widetilde{M}$ such that $(I,H,C,R) \rhd_{me} M : \widetilde{M}$ is given by $\bigcup\{m \mid m \in \mathcal{VM}[\![M]\!]me\}$ and the minimal $\widetilde{N}$ such that $(I,H,C,R) \Vdash_{me} N : \widetilde{N}$ is given by $\bigcup\{n \mid n \in \mathcal{VN}[\![N]\!]me\}$; we shall be slightly more precise in Lemma 3 below. (There is no need for an analogous function $\mathcal{VP}$ for processes since they do not have a "succinct" component.)

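The constraint generation functions $\mathcal{CP}$ and $\mathcal{CM}$ of Tables 6 and 7 generate master constraints for processes and capabilities, respectively. Master constraints take the following rather permissive forms:

$\{|l'|\} \subseteq \mathsf{I}(l)$

$\{|\beta|\} \subseteq \mathsf{I}(l)$

$n \subseteq \mathsf{H}(l)$

$\models^{*} \mathsf{in}^{l^i}$    and similarly for $\mathsf{out}^{l^o}$ and $\mathsf{open}^{l^p}$

$\forall l^i \in \mathbf{Lab}^i : \{|\mathsf{in}^{l^i}|\} \subseteq \mathsf{R}^c(\beta^c) \Rightarrow \{|l^i|\} \subseteq \mathsf{I}(l^a)$    and similarly for $\mathsf{out}^{l^o}$ and $\mathsf{open}^{l^p}$

$\forall l^i \in \mathbf{Lab}^i : \{|\mathsf{in}^{l^i}|\} \subseteq \mathsf{R}^c(\beta^c) \Rightarrow\; \models^{*} \mathsf{in}^{l^i}$    and similarly for $\mathsf{out}^{l^o}$ and $\mathsf{open}^{l^p}$

$\forall l \in \mathsf{I}^{-1}(l^c) : m \subseteq \mathsf{C}^c(l)$    and similarly for $l^n$, $n$ and $\mathsf{C}^n$

$\forall l \in \mathsf{I}^{-1}(\beta^c) : \mathsf{C}^c(l) \subseteq \mathsf{R}^c(\beta^c)$    and similarly for $\beta^n$, $\mathsf{C}^n$ and $\mathsf{R}^n$

The constraint $\models^{*} \mathsf{in}^{l^i}$ intuitively stands for $\models^{l} \mathsf{in}^{l^i}$ of Table 5 but without the condition $l^i \in I(l)$, i.e.

$\forall l^a \in \mathsf{I}^{-1}(l^i) : \forall \mu \in \mathsf{H}(l^i) : \forall l^{a'} \in \mathsf{I}^{-1}(l^a) : \forall l^{a''} \in \mathsf{I}(l^{a'}) \cap \mathsf{H}^{-1}(\mu) \cap \mathbf{Lab}^a : l^a \in \mathsf{I}(l^{a''})$

and similarly for $\models^{*} \mathsf{out}^{l^o}$ and $\models^{*} \mathsf{open}^{l^p}$. (There is no need for a constraint generation function $\mathcal{CN}$ for namings.)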

We shall dispense with formally defining what it means for a control flow estimate $(I,H,C,R)$ to satisfy a set of master constraints since the idea is clear: each constraint must be fulfilled when interpreting the formula with $\{\cdots\}$ for $\{|\cdots|\}$ and $R^n(\cdots)$ for $\mathsf{R}^n(\cdots)$ etc. We shall write $(I,H,C,R) \models \mathcal{C}$ when $(I,H,C,R)$ satisfies a set $\mathcal{C}$ of constraints and $[\![m]\!]_{(I,H,C,R)}$ for the value of $m$ under the interpretation $(I,H,C,R)$ and similarly for $[\![n]\!]_{(I,H,C,R)}$. It is then straightforward to prove that the master constraints generated by Tables 6 and 7 faithfully model the acceptability relation of Tables 4 and 5.

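Lemma 3. $(I,H,C,R) \models^{l}_{me} P$ if and only if $(I,H,C,R) \models \mathcal{CP}[\![P]\!]^{l}_{me}$.

Proof. The proof is by structural induction establishing also the following results:

$(I,H,C,R) \Vdash_{me} N : \widetilde{N}$  iff  $\forall n \in \mathcal{VN}[\![N]\!]me : [\![n]\!]_{(I,H,C,R)} \subseteq \widetilde{N}$

$(I,H,C,R) \rhd_{me} M : \widetilde{M}$  iff  $\forall m \in \mathcal{VM}[\![M]\!]me : [\![m]\!]_{(I,H,C,R)} \subseteq \widetilde{M} \;\wedge\; (I,H,C,R) \models \mathcal{CM}[\![M]\!]me$

We dispense with the details.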
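$\mathcal{CP}[\![(\nu n^{\mu})P]\!]^{l}_{me} = \mathcal{CP}[\![P]\!]^{l}_{me[n \mapsto \mu]}$

$\mathcal{CP}[\![0]\!]^{l}_{me} = \emptyset$

$\mathcal{CP}[\![P \mid P']\!]^{l}_{me} = \mathcal{CP}[\![P]\!]^{l}_{me} \cup \mathcal{CP}[\![P']\!]^{l}_{me}$

$\mathcal{CP}[\![{!P}]\!]^{l}_{me} = \mathcal{CP}[\![P]\!]^{l}_{me}$

$\mathcal{CP}[\![N^{l^a}[P]]\!]^{l}_{me} = \mathcal{CP}[\![P]\!]^{l^a}_{me} \cup \{\{|l^a|\} \subseteq \mathsf{I}(l)\} \cup \{n \subseteq \mathsf{H}(l^a) \mid n \in \mathcal{VN}[\![N]\!]me\}$

$\mathcal{CP}[\![M.P]\!]^{l}_{me} = \mathcal{CP}[\![P]\!]^{l}_{me} \cup \mathcal{CM}[\![M]\!]me \cup \bigcup_{m \in \mathcal{VM}[\![M]\!]me} \mathcal{X}(m)$, where $\mathcal{X}(m)$ is
    if $m$ is $\{|\mathsf{in}^{l^i}|\}$ then $\{\models^{*} \mathsf{in}^{l^i},\ \{|l^i|\} \subseteq \mathsf{I}(l)\}$ else
    if $m$ is $\{|\mathsf{out}^{l^o}|\}$ then $\{\models^{*} \mathsf{out}^{l^o},\ \{|l^o|\} \subseteq \mathsf{I}(l)\}$ else
    if $m$ is $\{|\mathsf{open}^{l^p}|\}$ then $\{\models^{*} \mathsf{open}^{l^p},\ \{|l^p|\} \subseteq \mathsf{I}(l)\}$ else
    $\{\ \forall l^i \in \mathbf{Lab}^i : \{|\mathsf{in}^{l^i}|\} \subseteq m \Rightarrow\; \models^{*} \mathsf{in}^{l^i},$
    $\ \ \forall l^i \in \mathbf{Lab}^i : \{|\mathsf{in}^{l^i}|\} \subseteq m \Rightarrow \{|l^i|\} \subseteq \mathsf{I}(l),$
    $\ \ \forall l^o \in \mathbf{Lab}^o : \{|\mathsf{out}^{l^o}|\} \subseteq m \Rightarrow\; \models^{*} \mathsf{out}^{l^o},$
    $\ \ \forall l^o \in \mathbf{Lab}^o : \{|\mathsf{out}^{l^o}|\} \subseteq m \Rightarrow \{|l^o|\} \subseteq \mathsf{I}(l),$
    $\ \ \forall l^p \in \mathbf{Lab}^p : \{|\mathsf{open}^{l^p}|\} \subseteq m \Rightarrow\; \models^{*} \mathsf{open}^{l^p},$
    $\ \ \forall l^p \in \mathbf{Lab}^p : \{|\mathsf{open}^{l^p}|\} \subseteq m \Rightarrow \{|l^p|\} \subseteq \mathsf{I}(l)\ \}$

$\mathcal{CP}[\![\langle M \rangle^{l^c}]\!]^{l}_{me} = \{\{|l^c|\} \subseteq \mathsf{I}(l)\} \cup \mathcal{CM}[\![M]\!]me \cup \{\forall l^a \in \mathsf{I}^{-1}(l^c) : m \subseteq \mathsf{C}^c(l^a) \mid m \in \mathcal{VM}[\![M]\!]me\}$

$\mathcal{CP}[\![\langle\langle N \rangle\rangle^{l^n}]\!]^{l}_{me} = \{\{|l^n|\} \subseteq \mathsf{I}(l)\} \cup \{\forall l^a \in \mathsf{I}^{-1}(l^n) : n \subseteq \mathsf{C}^n(l^a) \mid n \in \mathcal{VN}[\![N]\!]me\}$

$\mathcal{CP}[\![(x^{\beta^c}).P]\!]^{l}_{me} = \mathcal{CP}[\![P]\!]^{l}_{me[x \mapsto \beta^c]} \cup \{\{|\beta^c|\} \subseteq \mathsf{I}(l)\} \cup \{\forall l^a \in \mathsf{I}^{-1}(\beta^c) : \mathsf{C}^c(l^a) \subseteq \mathsf{R}^c(\beta^c)\}$

$\mathcal{CP}[\![((u^{\beta^n})).P]\!]^{l}_{me} = \mathcal{CP}[\![P]\!]^{l}_{me[u \mapsto \beta^n]} \cup \{\{|\beta^n|\} \subseteq \mathsf{I}(l)\} \cup \{\forall l^a \in \mathsf{I}^{-1}(\beta^n) : \mathsf{C}^n(l^a) \subseteq \mathsf{R}^n(\beta^n)\}$

Table 6. Constraint generation (for processes).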


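Example 5. The set $\mathcal{CP}[\![\mathit{Firewall} \mid \mathit{Agent}]\!]^{l_\star}_{me_\star}$ of master constraints for the program $n_\star^{l_\star}[\mathit{Firewall} \mid \mathit{Agent}]$ of Example 1 consists of the following master constraints (assuming that $me_\star$ is as in Example 3 and ignoring the subprocesses $P$ and $Q$):

$\{|A|\} \subseteq \mathsf{I}(l_\star)$                                    $\{|w|\} \subseteq \mathsf{H}(A)$
$\{|B|\} \subseteq \mathsf{I}(A)$                                          $\{|k|\} \subseteq \mathsf{H}(B)$
$\{|1|\} \subseteq \mathsf{I}(B)$    $\models^{*} \mathsf{out}^{1}$     $\{|w|\} \subseteq \mathsf{H}(1)$
$\{|2|\} \subseteq \mathsf{I}(B)$    $\models^{*} \mathsf{in}^{2}$      $\{|k'|\} \subseteq \mathsf{H}(2)$
$\{|3|\} \subseteq \mathsf{I}(B)$    $\models^{*} \mathsf{in}^{3}$      $\{|w|\} \subseteq \mathsf{H}(3)$
$\{|4|\} \subseteq \mathsf{I}(A)$    $\models^{*} \mathsf{open}^{4}$    $\{|k'|\} \subseteq \mathsf{H}(4)$
$\{|5|\} \subseteq \mathsf{I}(A)$    $\models^{*} \mathsf{open}^{5}$    $\{|k''|\} \subseteq \mathsf{H}(5)$
$\{|C|\} \subseteq \mathsf{I}(l_\star)$                                    $\{|k'|\} \subseteq \mathsf{H}(C)$
$\{|6|\} \subseteq \mathsf{I}(C)$    $\models^{*} \mathsf{open}^{6}$    $\{|k|\} \subseteq \mathsf{H}(6)$
$\{|D|\} \subseteq \mathsf{I}(C)$                                          $\{|k''|\} \subseteq \mathsf{H}(D)$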
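$\mathcal{CM}[\![\mathsf{in}^{l^i} N]\!]me = \{n \subseteq \mathsf{H}(l^i) \mid n \in \mathcal{VN}[\![N]\!]me\}$

$\mathcal{CM}[\![\mathsf{out}^{l^o} N]\!]me = \{n \subseteq \mathsf{H}(l^o) \mid n \in \mathcal{VN}[\![N]\!]me\}$

$\mathcal{CM}[\![\mathsf{open}^{l^p} N]\!]me = \{n \subseteq \mathsf{H}(l^p) \mid n \in \mathcal{VN}[\![N]\!]me\}$

$\mathcal{CM}[\![x]\!]me = \emptyset$

$\mathcal{CM}[\![\epsilon]\!]me = \emptyset$

$\mathcal{CM}[\![M_1.M_2]\!]me = \mathcal{CM}[\![M_1]\!]me \cup \mathcal{CM}[\![M_2]\!]me$

$\mathcal{VM}[\![\mathsf{in}^{l^i} N]\!]me = \{\{|\mathsf{in}^{l^i}|\}\}$

$\mathcal{VM}[\![\mathsf{out}^{l^o} N]\!]me = \{\{|\mathsf{out}^{l^o}|\}\}$

$\mathcal{VM}[\![\mathsf{open}^{l^p} N]\!]me = \{\{|\mathsf{open}^{l^p}|\}\}$

$\mathcal{VM}[\![x]\!]me = \{\mathsf{R}^c(me(x))\}$

$\mathcal{VM}[\![\epsilon]\!]me = \emptyset$

$\mathcal{VM}[\![M_1.M_2]\!]me = \mathcal{VM}[\![M_1]\!]me \cup \mathcal{VM}[\![M_2]\!]me$

$\mathcal{VN}[\![n]\!]me = \{\{|me(n)|\}\}$

$\mathcal{VN}[\![u]\!]me = \{\mathsf{R}^n(me(u))\}$

Table 7. Constraint generation (for capabilities and namings).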


It is straightforward to check that the analysis estimate $(I,H,C,R)$ of Example 3 indeed satisfies these master constraints. □
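The constraints of Example 5 can be reproduced mechanically. The following minimal sketch implements constraint generation for the communication-free fragment only, with atomic capabilities applied to names; the tuple encoding of processes and constraints is an assumption of the sketch, and the shape of the firewall and agent is read off from Examples 4 and 5 with the subprocesses $P$ and $Q$ replaced by the inactive process.

```python
def cp(P, l, me):
    """Master constraints for process P inside ambient label l.

    Constraints are encoded as tuples:
      ("I", x, l)      for {|x|} <= I(l)
      ("H", mu, l)     for {|mu|} <= H(l)
      ("star", k, l)   for the abbreviation |=* k^l
    """
    tag = P[0]
    if tag == "nil":
        return set()
    if tag == "par":
        return cp(P[1], l, me) | cp(P[2], l, me)
    if tag == "bang":                       # replication is analysed as P
        return cp(P[1], l, me)
    if tag == "new":                        # restriction updates me
        _, n, mu, body = P
        return cp(body, l, {**me, n: mu})
    if tag == "amb":                        # n^la[body]
        _, n, la, body = P
        return cp(body, la, me) | {("I", la, l), ("H", me[n], la)}
    if tag == "cap":                        # kind^lab n . body
        _, (kind, lab, n), body = P
        return cp(body, l, me) | {
            ("H", me[n], lab), ("star", kind, lab), ("I", lab, l)}
    raise ValueError(tag)

NIL = ("nil",)
firewall = ("new", "w", "w", ("amb", "w", "A", ("par",
    ("amb", "k", "B",
        ("cap", ("out", 1, "w"),
            ("cap", ("in", 2, "k'"), ("cap", ("in", 3, "w"), NIL)))),
    ("cap", ("open", 4, "k'"), ("cap", ("open", 5, "k''"), NIL)))))
agent = ("amb", "k'", "C",
    ("cap", ("open", 6, "k"), ("amb", "k''", "D", NIL)))

me0 = {"k": "k", "k'": "k'", "k''": "k''"}
constraints = cp(("par", firewall, agent), "l*", me0)
print(len(constraints))
```

Each syntactic construct contributes a constant number of constraints, in line with the linear bound on constraint generation discussed next.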

Suppose that the program is of size $p > 1$ and that the size of the finite set $\mathbf{Lab} \cup \mathbf{Bnd} \cup \mathbf{SNam}$ is $q > 1$. It is natural to assume that $q = O(p)$, as when $\mathbf{Lab} \cup \mathbf{Bnd} \cup \mathbf{SNam}$ is a finite set consisting only of the entities used in the program; however, we shall allow $q$ to be smaller than $p$ so as to trade precision for efficiency.

Note that $\mathcal{VN}[\![N]\!]me$ is always a singleton and that $\mathcal{CM}[\![M]\!]me$ and $\mathcal{VM}[\![M]\!]me$ contain no more elements than corresponds to the size of $M$. It is then immediate that constraint generation operates in time $O(p)$ and that it produces $O(p)$ master constraints each of size $O(1)$. Since $q$ may be much smaller than $p$ it will be useful to observe that the number of constraints can also be given as $O(q^2)$; to see this simply note that each master constraint contains at most two "free" symbols from $\mathbf{Lab} \cup \mathbf{Bnd} \cup \mathbf{SNam}$. In this case constraint generation time should be estimated as $O(p + q^2)$.

Conditional constraints. The next stage is to expand the master constraints into sets of conditional constraints that do not involve quantifiers and that do not rely on the $\models^{*} \cdots$ abbreviations adapted from Table 5. The general syntax of conditional constraints is based on constants (denoted set), variables (denoted var), conditions (denoted cond) and constraints (denoted constr):

set ::= $\{|l|\} \mid \{|\beta|\} \mid \{|\mu|\} \mid \{|\mathsf{in}^{l}|\} \mid \{|\mathsf{out}^{l}|\} \mid \{|\mathsf{open}^{l}|\}$

var ::= $\mathsf{I}(l) \mid \mathsf{H}(l) \mid \mathsf{C}^c(l) \mid \mathsf{C}^n(l) \mid \mathsf{R}^c(\beta) \mid \mathsf{R}^n(\beta) \mid \cdots$

cond ::= set $\subseteq$ var

constr ::= cond $\mid$ cond $\Rightarrow$ constr

To transform master constraints into conditional constraints we perform the following
operations:

- expand $var_1 \subseteq var_2$ into $\forall set : set \subseteq var_1 \Rightarrow set \subseteq var_2$;

- unfold the definition of $\models$;

- move quantifiers outermost;

- eliminate quantifiers by instantiating the bodies with all possible labels;

- eliminate all “$-1$” operations by changing $\{x\} \subseteq I^{-1}(y)$ to $\{y\} \subseteq I(x)$.
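
The first and fourth operations can be illustrated by a small sketch. It assumes a hypothetical encoding in which a condition `set ⊆ var` is a pair `(element, variable)` and a conditional constraint is a pair `(premises, conclusion)`; the function names and the sample universe are illustrative and not taken from the paper.

```python
from itertools import product

def expand_inclusion(var1, var2, universe):
    """Expand var1 ⊆ var2 into one constraint {e} ⊆ var1 => {e} ⊆ var2 per element e."""
    return [([(e, var1)], (e, var2)) for e in universe]

def instantiate(template, *label_sets):
    """Eliminate quantifiers by instantiating a constraint template with all label tuples."""
    return [template(*labels) for labels in product(*label_sets)]

# Hypothetical universe of three labels.
labels = ["l1", "l2", "l3"]

# C^c(l1) ⊆ R^c(b1) becomes three conditional constraints, one per candidate element.
expanded = expand_inclusion(("Cc", "l1"), ("Rc", "b1"), labels)

# A template quantified over two labels yields len(labels)**2 instances.
instances = instantiate(lambda a, b: ([(a, ("I", b))], (b, ("I", a))), labels, labels)
```

The number of generated constraints grows with the number of quantified symbols, which is why the text keeps track of how many “free” symbols each master constraint contains.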

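We illustrate the development for the following master constraint:

\[ \forall l^t \in \mathbf{Lab}^t : \{in^{l^t}\} \subseteq R^c(\beta^c) \Rightarrow\ \models in^{l^t} \qquad (2) \]

Straightforward application of the above operations gives rise to the following set
of conditional constraints:

\[
\left\{
\begin{array}{l}
(\{in^{l^t}\} \subseteq R^c(\beta^c)) \Rightarrow \\
(\{l^t\} \subseteq I(l^a)) \Rightarrow \\
(\{\mu\} \subseteq H(l^t)) \Rightarrow \\
(\{l^a\} \subseteq I(l^{a'})) \Rightarrow \\
(\{l^{a''}\} \subseteq I(l^{a'})) \Rightarrow \\
(\{\mu\} \subseteq H(l^{a''})) \Rightarrow \\
(\{l^a\} \subseteq I(l^{a''}))
\end{array}
\ \middle|\
\begin{array}{l}
l^t \in \mathbf{Lab}^t, \\
l^a \in \mathbf{Lab}^a, \\
\mu \in \mathbf{SNam}, \\
l^{a'} \in \mathbf{Lab}^a, \\
l^{a''} \in \mathbf{Lab}^a
\end{array}
\right\}
\]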

It turns out that this suffices for obtaining a polynomial time algorithm for computing
the least solution; however, in the interest of obtaining a cubic time algorithm we
shall “tile” the conditional constraints in the manner of [20]. To do so we introduce
three auxiliary relations:


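- $\{l^t\} \subseteq \mathrm{IN}(\circ)$ meaning that $in^{l^t}$ has been found to be “active”;

- $\{l^{a''}\} \subseteq \mathrm{NAM}(l^t)$ meaning that $l^t$ and $l^{a''}$ have the same name;

- $\{l^a\} \subseteq \mathrm{SIB}(l^{a''})$ meaning that $l^a$ and $l^{a''}$ are siblings.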


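(As will be shown below, their values are all derived from the estimate $(I, H, C, R)$
under consideration.)

The master constraint (2) then gives rise to a set of “local” constraints and three
sets of “global” constraints that are shared for all values of $\beta^c$ occurring in (2). The
set of “local” constraints, generated for each occurrence of (2), computes the IN
relation from $(I, H, C, R)$:

\[ \{\ (\{in^{l^t}\} \subseteq R^c(\beta^c)) \Rightarrow (\{l^t\} \subseteq \mathrm{IN}(\circ)) \ \mid\ l^t \in \mathbf{Lab}^t \ \} \]

One of the sets of “global” constraints, shared for all occurrences of (2), computes
the NAM relation from $(I, H, C, R)$:

\[
\left\{
\begin{array}{l}
(\{\mu\} \subseteq H(l^t)) \Rightarrow \\
(\{\mu\} \subseteq H(l^{a''})) \Rightarrow \\
(\{l^{a''}\} \subseteq \mathrm{NAM}(l^t))
\end{array}
\ \middle|\
\begin{array}{l}
l^t \in \mathbf{Lab}^t, \\
\mu \in \mathbf{SNam}, \\
l^{a''} \in \mathbf{Lab}^a
\end{array}
\right\}
\]

The other set of global constraints, shared for all occurrences of (2), computes the
SIB relation from $(I, H, C, R)$:

\[
\left\{
\begin{array}{l}
(\{l^a\} \subseteq I(l^{a'})) \Rightarrow \\
(\{l^{a''}\} \subseteq I(l^{a'})) \Rightarrow \\
(\{l^a\} \subseteq \mathrm{SIB}(l^{a''}))
\end{array}
\ \middle|\
\begin{array}{l}
l^a \in \mathbf{Lab}^a, \\
l^{a'} \in \mathbf{Lab}^a, \\
l^{a''} \in \mathbf{Lab}^a
\end{array}
\right\}
\]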

Type of Master Constraint                                                                      # Instances        # “Local”    # “Global”

$\{l\} \subseteq I(l)$                                                                         $O(p)$ & $O(q^2)$   $O(1)$       $O(1)$
$\{\beta\} \subseteq I(l)$                                                                     $O(p)$ & $O(q^2)$   $O(1)$       $O(1)$
$N \subseteq H(l)$                                                                             $O(p)$ & $O(q^2)$   $O(q)$       $O(1)$
$\models in^{l}$ etc.                                                                          $O(p)$ & $O(q)$     $O(1)$       $O(q^3)$
$\forall l \in \mathbf{Lab}^t : \{in^{l}\} \subseteq R^c(\beta^c) \Rightarrow \{l\} \subseteq I(l)$ etc.    $O(p)$ & $O(q^2)$   $O(q)$       $O(1)$
$\forall l \in \mathbf{Lab}^t : \{in^{l}\} \subseteq R^c(\beta^c) \Rightarrow\ \models in^{l}$ etc.         $O(p)$ & $O(q)$     $O(q)$       $O(q^3)$
$\forall l \in I^{-1}(l^c) : M \subseteq C^c(l)$ etc.                                          $O(p)$ & $O(q^2)$   $O(q)$       $O(q^3)$
$\forall l \in I^{-1}(\beta^c) : C^c(l) \subseteq R^c(\beta^c)$ etc.                           $O(p)$ & $O(q)$     $O(q^2)$     $O(1)$

Table 8. From master constraints to conditional constraints.



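The final set of global constraints, shared for all occurrences of (2), performs the
update of $I$:

\[
\left\{
\begin{array}{l}
(\{l^t\} \subseteq \mathrm{IN}(\circ)) \Rightarrow \\
(\{l^t\} \subseteq I(l^a)) \Rightarrow \\
(\{l^a\} \subseteq \mathrm{SIB}(l^{a''})) \Rightarrow \\
(\{l^{a''}\} \subseteq \mathrm{NAM}(l^t)) \Rightarrow \\
(\{l^a\} \subseteq I(l^{a''}))
\end{array}
\ \middle|\
\begin{array}{l}
l^t \in \mathbf{Lab}^t, \\
l^a \in \mathbf{Lab}^a, \\
l^{a''} \in \mathbf{Lab}^a
\end{array}
\right\}
\]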

We proceed in a similar way for the other master constraints and write $\mathcal{C}[\![P]\!]^{l}_{me}$ for
the set of conditional constraints obtained by expanding $\mathcal{C}_P[\![P]\!]^{l}_{me}$. Once more we
dispense with formally defining what it means for a control flow estimate $(I, H, C, R)$
to satisfy a set of conditional constraints since the idea is clear: the auxiliary relations
(IN, NAM, etc.) should be given as the least relations satisfying their defining
constraints. In analogy with Lemma 3 we then have the following result:

Lemma 4. $(I, H, C, R) \models \mathcal{C}[\![P]\!]^{l}_{me}$ if and only if $(I, H, C, R) \models^{l}_{me} P$.

Table 8 shows for each form of master constraint how many instances are generated,
how many “local” conditional constraints are generated by expanding each occurrence
of a master constraint, and how many “global” conditional constraints are
shared for all master constraints of the form considered. The number of master
constraints is generally written as $O(p)$ & $O(q^2)$ to indicate that both bounds $O(p)$ and
$O(q^2)$ apply as discussed previously; also note that only $O(p)$ & $O(q)$ master
constraints involve $\models$ since the relevant master constraints only contain one “free”
symbol from $\mathbf{Lab} \cup \mathbf{Bnd} \cup \mathbf{SNam}$, and similarly for two other types of master
constraints. In the example treated in detail, $O(q)$ “local” conditional constraints are
produced for each occurrence of a master constraint and $O(q^3)$ “global” conditional
constraints are shared; without the use of “tiling” [20] we would have generated
$O(q^5)$ constraints for each occurrence of a master constraint.

Since $q = O(p)$ it follows that constraint generation operates in time $O(p^3)$,
producing $O(p^3)$ conditional constraints of size $O(1)$. It is useful to observe that
constraint generation can also be estimated as $O(p + q^3)$ steps for generating $O(q^3)$
conditional constraints of size $O(1)$.
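
The effect of tiling can be illustrated with a small sketch that performs one update step for master constraint (2) in two ways: directly, ranging over all five quantified symbols, and through the auxiliary relations IN, NAM and SIB, each of which ranges over at most three symbols. The encoding is an assumption made for illustration only: `I` is a set of pairs `(child, parent)`, `H` is a set of pairs `(name, label)`, and `in_caps` holds the labels $l^t$ with $in^{l^t}$ in $R^c(\beta^c)$.

```python
from itertools import product

def update_naive(in_caps, I, H, Lt, La, S):
    """Direct reading: one conditional constraint per (lt, la, mu, la1, la2), i.e. q^5 of them."""
    out = set()
    for lt, la, mu, la1, la2 in product(Lt, La, S, La, La):
        if (lt in in_caps and (lt, la) in I and (mu, lt) in H
                and (la, la1) in I and (la2, la1) in I and (mu, la2) in H):
            out.add((la, la2))      # la may move into its sibling la2
    return out

def update_tiled(in_caps, I, H, Lt, La, S):
    """Tiled reading: IN, NAM and SIB are computed separately, each from at most q^3 instances."""
    IN = set(in_caps)
    NAM = {(la2, lt) for lt, mu, la2 in product(Lt, S, La)
           if (mu, lt) in H and (mu, la2) in H}
    SIB = {(la, la2) for la, la1, la2 in product(La, La, La)
           if (la, la1) in I and (la2, la1) in I}
    return {(la, la2) for lt, la, la2 in product(Lt, La, La)
            if lt in IN and (lt, la) in I and (la, la2) in SIB and (la2, lt) in NAM}

# Hypothetical estimate: ambient a holds the capability t1 = "in m"; b is named m;
# a and b are both children of top.
Lt, La, S = ["t1"], ["a", "b", "top"], ["n", "m"]
I = {("t1", "a"), ("a", "top"), ("b", "top")}
H = {("m", "t1"), ("m", "b"), ("n", "a")}
naive = update_naive({"t1"}, I, H, Lt, La, S)
tiled = update_tiled({"t1"}, I, H, Lt, La, S)
```

Both readings agree because the stable name $\mu$ and the parent label $l^{a'}$ occur in independent groups of conditions, so each can be projected away inside its own auxiliary relation.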


Constraint solving. We now consider how to turn $\mathcal{C}[\![P]\!]^{l}_{me}$ into the smallest acceptable
analysis estimate $(I, H, C, R)$. Perhaps the simplest approach is to use a Round
Robin algorithm (see e.g. [17, Chapter 6]). A more efficient approach is based on
worklist algorithms (see e.g. [17, Chapter 6]) and makes sure to consider constraints
only when they are likely to have been enabled due to recent changes. This is the key
idea behind the approach of [11] and allows us to solve the conditional constraints
in $\mathcal{C}[\![P]\!]^{l}_{me}$ in time proportional to their size.
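
For comparison with the worklist approach, the Round Robin idea can be sketched as follows. This is a minimal illustration under an assumed encoding, not the algorithm of [17] or [11]: a constraint is a pair `(premises, conclusion)`, where each premise and the conclusion is a pair `(element, variable)` standing for `{element} ⊆ variable`.

```python
def round_robin(constraints):
    """Re-examine every constraint in turn until a full pass changes nothing."""
    solution = {}                      # variable -> set of elements
    changed = True
    while changed:
        changed = False
        for premises, (elem, var) in constraints:
            enabled = all(e in solution.get(v, set()) for e, v in premises)
            if enabled and elem not in solution.get(var, set()):
                solution.setdefault(var, set()).add(elem)
                changed = True
    return solution

# Hypothetical constraint system over two variables X and Y.
example = [
    ([], ("a", "X")),                          # {a} ⊆ X
    ([("a", "X")], ("b", "Y")),                # {a} ⊆ X  =>  {b} ⊆ Y
    ([("b", "Y"), ("a", "X")], ("c", "X")),    # {b} ⊆ Y  =>  {a} ⊆ X  =>  {c} ⊆ X
    ([("z", "X")], ("d", "Y")),                # never enabled
]
least = round_robin(example)
```

Each pass inspects all constraints whether or not their premises could have changed, which is exactly the work that the worklist formulation avoids.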

Theorem 3 (Theorem 2 of [20]). The least control flow estimate,
\[ \sqcap \{ (I,H,C,R) \mid (I,H,C,R) \models^{l}_{me} P \}, \]
can be computed in cubic time.

Proof. Thanks to Lemmas 3 and 4 we have

\[ \sqcap \{ (I,H,C,R) \mid (I,H,C,R) \models^{l}_{me} P \} = \sqcap \{ (I,H,C,R) \mid (I,H,C,R) \models \mathcal{C}[\![P]\!]^{l}_{me} \} \]

and we already established that $\mathcal{C}[\![P]\!]^{l}_{me}$ has size $O(p^3)$ & $O(q^3)$ and can be generated
in time $O(p^3)$ & $O(p + q^3)$. Thanks to the techniques of [11] the least solution can be
found in time proportional to the size of the constraint system.

A constructive algorithm for computing the least solution can be found in Appendix
A. Theorem 3 is attributed to [20], which developed “tiling” to establish a similar result
for the communication-free fragment of the ambient calculus; no new complications
were encountered when dealing with communication. □

The cubic time bound is satisfying since this is also the complexity of control flow
analyses for functional and object oriented languages, where control flow analysis is
used with success on “medium-sized” programs (between 10K and 100K lines of
code). Furthermore, since $q$ can be chosen arbitrarily small (although at the cost of
precision) this gives confidence in exploring our current technology on fully realistic
internet programs.


4  Validating  Firewalls 

In the examples we have studied a notion of firewall given by the passwords $k$, $k'$
and $k''$ used for entering it. One aspect of being a firewall is that agents in the
approved form must be allowed to enter. For the firewall proposed in Example 1 the
approved form is ${k'}^C[open^6 k.\, {k''}^D[Q]]$ and in Example 2 we showed that agents in this
form can indeed enter the firewall: Firewall | Agent $\to^*$ $(\nu w^w)w^A[P \mid Q]$ (assuming
that $w \notin fn(Q)$). It is shown in [7] that this can be strengthened to establish that
Firewall | Agent is observationally equivalent to $(\nu w^w)w^A[P \mid Q]$ (assuming that
$fn(P) \cap \{k, k', k''\} = \emptyset = fn(Q) \cap \{w, k, k', k''\}$).

Another  aspect  of  being  a  firewall,  not  dealt  with  in  [7],  is  to  ensure  that  processes 
not  knowing  the  right  passwords  cannot  enter.  Due  to  the  power  of  the  ambient 
calculus  this  is  not  as  trivial  as  it  might  appear  at  first  sight.  As  an  example,  a 
process  that  does  not  initially  know  the  passwords  might  nonetheless  learn  them  by 
other  means.  As  another  example,  the  firewall  might  contain  a  trapdoor  through 
which  processes  might  be  able  to  enter  (see  Example  6  below). 

Intuitively, we define a process (or attacker) $U$ to be unaware whenever $fn(U) \cap
\{k, k', k''\} = \emptyset$. We then define a proposed firewall to be protective whenever the
semantics of Section 2 prevents it from allowing any unaware process to enter.
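
The condition on free names can be checked mechanically. The sketch below assumes a hypothetical tuple encoding of the communication-free fragment only (inactivity, parallel composition, replication, ambients, capability prefixes and restriction); the tag names and the password spellings are illustrative.

```python
def fn(p):
    """Free names of a process in the assumed tuple encoding."""
    tag = p[0]
    if tag == "nil":                   # 0
        return set()
    if tag == "par":                   # P | Q
        return fn(p[1]) | fn(p[2])
    if tag == "bang":                  # !P
        return fn(p[1])
    if tag == "amb":                   # n[P]
        return {p[1]} | fn(p[2])
    if tag == "cap":                   # in n.P, out n.P or open n.P
        return {p[2]} | fn(p[3])
    if tag == "new":                   # (nu n)P binds n in P
        return fn(p[2]) - {p[1]}
    raise ValueError(f"unknown process form: {tag}")

def is_unaware(p, passwords=frozenset({"k", "k'", "k''"})):
    """U is unaware whenever fn(U) has no name in common with the passwords."""
    return fn(p).isdisjoint(passwords)

NIL = ("nil",)
# q[in t.0]: an attacker that never mentions a password.
attacker = ("amb", "q", ("cap", "in", "t", NIL))
# k'[open k. k''[0]]: the approved agent form, which does use the passwords.
agent = ("amb", "k'", ("cap", "open", "k", ("amb", "k''", NIL)))
```

As the surrounding text stresses, passing this syntactic check says nothing about what the process may learn or exploit at run time; it only fixes which attackers the notion of protectiveness quantifies over.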

Example 6. Consider the proposed firewall

Firewall' : $(\nu w^w)w^A[k^B[out^1 w.\, in^2 k'.\, in^3 w] \mid open^4 k'.\, open^5 k''.\, P$
$\mid t^E[out^7 w.\, in^8 w.\, open^9 q] \mid open^{10} t]$

that additionally contains a trapdoor $t$. It is easy to check that

Firewall' | Agent $\to^*$ $(\nu w^w)w^A[\cdots \mid P \mid Q]$

using Agent of Example 1 (assuming that $w \notin fn(Q)$). But now the unaware process
$q^F[in^{11} t.\, Q]$ can also enter as is shown by

Firewall' | $q^F[in^{11} t.\, Q]$ $\to^*$ $(\nu w^w)w^A[\cdots \mid P \mid Q]$

(again assuming that $w \notin fn(Q)$) unlike what was intended. This means that
Firewall' is not a protective firewall because it can be entered by a process not
knowing the right passwords. □

In the development below we focus on one particular interface for firewalls, as given
by the three passwords and formats shown above, but the development can be
adapted to other interfaces as well. With respect to the chosen interface we then
aim at developing a safe test for when a proposed realisation, e.g. Firewall' or
Firewall, can be proved to be protective (i.e. to live up to the expectations).

We proceed by devising a test based on the control flow analysis; to facilitate this
we need to formalise the notions of “proposed firewall” and “unaware process” (or
unaware attacker) in the terminology of the control flow analysis. In essence this
amounts to shifting our attention from names to stable names.

A proposed firewall is specified by a tuple of the form
$(me_\star, ((\nu w^w)w^A[F]), k, k', k'')$

such that $(me_\star, n_\star^{l_\star}[(\nu w^w)w^A[F]])$ is a program. As in the examples we assume that
there are three passwords but the development can easily be adapted to an arbitrary
selection of passwords.

To enable the proposed firewall to pass the test to be developed, it will be helpful
to arrange that the naming environment respects the privacy of the name of the
firewall, i.e. $me_\star^{-1}(w) = \emptyset$, that the naming environment respects the uniqueness
of the passwords, i.e. $me_\star^{-1}(k) = \{k\}$, $me_\star^{-1}(k') = \{k'\}$ and $me_\star^{-1}(k'') = \{k''\}$, and
that the process $F$ does not contain any of the stable names $w$, $k$, $k'$ or $k''$, although
this is not formally required.

For technical reasons we shall arrange that the proposed firewall does not contain
any distinguished symbols (in particular those marked “•”) and that the naming
environment does not map any names to the distinguished stable name $\mu_\bullet$; this can
always be achieved by adjusting the choice of distinguished symbols and by Fact 2
this has no semantic consequences.

An unaware attacker (relative to the form of proposed firewall considered here) is
a process $U$ such that

- $(me_\star, n_\star^{l_\star}[U])$ is a program, and

- no free name in $U$ is mapped to any of the “private” stable names, i.e.
$\{me_\star(n) \mid n \in fn(U)\} \cap \{w, k, k', k''\} = \emptyset$.

When the above recommendations are adhered to, the second condition follows from
the assumption that none of the passwords $k$, $k'$ or $k''$ are free in $U$. Note that the
first and second condition together establish the following stronger version of the
second condition: $\{me_\star(n) \mid n \in fn(U)\} \cap \{\mu_\star, w, k, k', k''\} = \emptyset$; we shall exploit this
fact in the proof of Proposition 1.
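
INPUT:   a proposed firewall $(me_\star, ((\nu w^w)w^A[F]), k, k', k'')$
         without distinguished symbols in $w^A[F]$

OUTPUT:  “accept” or “reject”

METHOD:  let $\{n_1, \cdots, n_m\} = \{n \in dom(me_\star) \mid me_\star(n) \notin \{\mu_\star, w, k, k', k''\}\}$

         let $n \notin dom(me_\star)$ and $x \in \mathbf{Var}^c$ and $u \in \mathbf{Var}^n$

         construct $U'_\bullet$ =
\[
\left(
\begin{array}{l}
\langle in^{l_\bullet} n \rangle^{l_\bullet} \mid \langle out^{l_\bullet} n \rangle^{l_\bullet} \mid \langle open^{l_\bullet} n \rangle^{l_\bullet} \mid \\
(x^{\beta^c_\bullet}).(\langle x \rangle^{l_\bullet} \mid x.0) \mid \\
\langle\langle n \rangle\rangle^{l_\bullet} \mid \langle\langle n_1 \rangle\rangle^{l_\bullet} \mid \cdots \mid \langle\langle n_m \rangle\rangle^{l_\bullet} \mid \\
((u^{\beta^n_\bullet})).(\langle\langle u \rangle\rangle^{l_\bullet} \mid u^{l^a_\bullet}[0] \mid in^{l_\bullet} u \mid out^{l_\bullet} u \mid open^{l_\bullet} u)
\end{array}
\right)
\]

         construct $U_\bullet = \,!(\nu n^{\mu_\bullet})(U'_\bullet \mid n^{l^a_\bullet}[U'_\bullet])$
         compute the least $(I, H, C, R)$ such that
         $(I, H, C, R) \models^{l_\star}_{me_\star} ((\nu w^w)w^A[F]) \mid U_\bullet$
         if $\exists l^a \in \mathbf{Lab}^a : l^a_\bullet \in I^+(l^a) \wedge w \in H(l^a)$
         then “reject”
         else “accept”


Table 9. Testing for protectiveness.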
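
The final step of the test, once the least estimate is available, can be sketched as follows. The encoding is an assumption made for illustration: `I` maps each ambient label to the set of labels it may directly contain, `H` maps each ambient label to its possible stable names, and the closure is computed by naive iteration rather than by an optimised cubic-time procedure.

```python
def transitive_closure(I):
    """Compute I+ for a relation given as a dict: label -> set of directly contained labels."""
    closure = {label: set(children) for label, children in I.items()}
    changed = True
    while changed:
        changed = False
        for children in closure.values():
            extra = set()
            for c in children:
                extra |= closure.get(c, set())
            if not extra <= children:
                children |= extra
                changed = True
    return closure

def protectiveness_test(I, H, attacker_label, w="w"):
    """Reject if the attacker's ambient label may end up (transitively) inside an ambient named w."""
    plus = transitive_closure(I)
    for label, names in H.items():
        if w in names and attacker_label in plus.get(label, set()):
            return "reject"
    return "accept"

# Hypothetical estimates: "att" stands for the distinguished ambient label of the attacker.
H = {"A": {"w"}, "B": {"k"}, "att": {"mu"}}
I_ok = {"top": {"A", "att"}, "A": {"B"}}
I_bad = {"top": {"A", "att"}, "A": {"B"}, "B": {"att"}}
```

Here `protectiveness_test(I_ok, H, "att")` accepts because the attacker label only occurs beside `A`, whereas `I_bad` places it below `B`, which is itself inside `A`, so the test rejects.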


To  develop  a  sound  test  for  validating  the  protectiveness  of  a  proposed  firewall  (see 
Table  9)  we  proceed  in  two  stages.  First  recall  that  we  defined  a  proposed  firewall  to 
be  protective  whenever  the  semantics  prevents  any  unaware  process  from  entering. 
The  first  stage  of  the  development  then  is  to  define  a  related  notion  where  the  control 
flow  analysis  plays  the  role  of  the  semantics:  a  firewall  is  strongly  protective  whenever 
the  control  flow  analysis  is  able  to  demonstrate  that  no  unaware  process  can  enter 
the  firewall.  It  follows  from  the  correctness  of  the  control  flow  analysis  (Theorem  1) 
that  a  strongly  protective  firewall  is  also  protective.  Since  the  control  flow  analysis 
is  approximate  it  would  be  unlikely  for  the  converse  result  to  hold;  however,  we 
shall see that Firewall is both protective and strongly protective whereas Firewall'
is neither.

This then leads to the following interim suggestion for testing the strong protectiveness
of a proposed firewall $(me_\star, ((\nu w^w)w^A[F]), k, k', k'')$:


if there exists an unaware attacker $U$ such that

for the least $(I, H, C, R)$ satisfying $(I, H, C, R) \models^{l_\star}_{me_\star} ((\nu w^w)w^A[F]) \mid U$
there exists $l^a_U \in \mathbf{Lab}^a$ labelling an entity in $U$ and $l^a \in \mathbf{Lab}^a$
such that $l^a_U \in I^+(l^a) \wedge w \in H(l^a)$
then “reject” else “accept”


While this test is sound it involves a search over an infinity of unaware attackers and
so is not readily implementable. The second stage of the development therefore is
to restrict the search to a finite set of unaware attackers that are as hard to protect
against as any other unaware attackers; in our case a single unaware attacker will
do and it is called the hardest attacker. It amounts to the process $U_\bullet$ of Table 9.
Clearly the hardest attacker should have access to all stable names except those
of $\{\mu_\star, w, k, k', k''\}$; indeed, we ensure that only the stable name $\mu_\bullet$ is introduced
internally. In a similar way we ensure that only the binders $\beta^c_\bullet$ and $\beta^n_\bullet$ and only the
distinguished labels (those marked “•”) are used internally. Over this universe of
names, binders and labels, the hardest attacker must be able to perform all outputs
of names, to create ambients and capabilities with the required names, to output
and input all kinds of capabilities and to enact all relevant capabilities.

It is hardly immediate that $U_\bullet$ of Table 9 lives up to these expectations. However,
we are developing a sound test for strong protectiveness and so can restrict our
attention to the level of granularity embodied in the control flow analysis. This
allows us to prove in Proposition 1 below that the definition of $U_\bullet$ is sufficiently
general to capture the behaviour of all unaware attackers, as far as the control
flow analysis is concerned.

Example 7. To test Firewall from Example 1 and Firewall' from Example 6 we need
to be more precise about the subprocess $F$; in our tests we have used

!p[in p | out p | open p | p[0]]

(omitting labels) as an example of an unrestricted internal process. Then Firewall
passes the test because $H^{-1}(w) = \{A\}$ and $l^a_\bullet \notin I^+(A)$ but Firewall' fails the test
because $H^{-1}(w) = \{A\}$ and $l^a_\bullet \in I^+(A)$. □

As explained above the correctness of the test hinges on the following key result; it
shows that, from the point of view of the analysis, it is as hard to protect a firewall
against the process $U_\bullet$ of Table 9 as it is to protect it against any other unaware
process $U$. The formulation uses the operation $\lfloor U \rfloor$ also used to express Fact 2:
all stable names, binders and labels are replaced by the appropriate distinguished
stable names, binders and labels.

Proposition 1. Let $(me_\star, ((\nu w^w)w^A[F]), k, k', k'')$ be a proposed firewall as demanded
in Table 9 and let $(I, H, C, R)$ be as in Table 9. Then

\[ (I, H, C, R) \models^{l_\star}_{me_\star} ((\nu w^w)w^A[F]) \mid \lfloor U \rfloor \]

whenever $U$ is an unaware attacker.

Proof. The proof is in two parts. The first draws a number of consequences from
the fact that $(I, H, C, R) \models^{l_\star}_{me_\star} U_\bullet$ and the second proves a number of “general”
results from which $(I, H, C, R) \models^{l_\star}_{me_\star} \lfloor U \rfloor$ immediately follows.


Part 1. For the first part we unfold the formula defining $(I, H, C, R) \models^{l_\star}_{me_\star} U_\bullet$.
Since $U_\bullet$ appears outermost as well as inside $n^{l^a_\bullet}[\cdots]$ we get:

/(<*)2{i  (3) 

/K)2  0:,i 

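In the first two lines of $U_\bullet$ we output all three kinds of capabilities and then input
them again; similarly in the last two lines we output all the stable names
$\mu_\bullet, me_\star(n_1), \cdots, me_\star(n_m)$ accessible to $U_\bullet$ and then input them again; this yields:

\[
\begin{array}{l}
R^c(\beta^c_\bullet) \supseteq \{in^{l_\bullet}, out^{l_\bullet}, open^{l_\bullet}\} \\
R^n(\beta^n_\bullet) \supseteq \{\mu_\bullet, me_\star(n_1), \cdots, me_\star(n_m)\}
\end{array}
\qquad (4)
\]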

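More generally, the fact that we input what has been output, ensures that the
clauses for input of Table 4 establish the following containments:

\[
\begin{array}{l}
\forall l^a \in I^{-1}(\beta^c_\bullet) : C^c(l^a) \subseteq R^c(\beta^c_\bullet) \\
\forall l^a \in I^{-1}(\beta^n_\bullet) : C^n(l^a) \subseteq R^n(\beta^n_\bullet)
\end{array}
\qquad (5)
\]

In a similar way we also make sure to output what has just been input and the
clauses for output of Table 4 then establish the “dual” containments:

\[
\begin{array}{l}
\forall l^a \in I^{-1}(l^c_\bullet) : C^c(l^a) \supseteq R^c(\beta^c_\bullet) \\
\forall l^a \in I^{-1}(l^n_\bullet) : C^n(l^a) \supseteq R^n(\beta^n_\bullet)
\end{array}
\qquad (6)
\]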

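In the second line of $U_\bullet$ we perform the capability just input; since this takes place
both outermost and inside $n^{l^a_\bullet}[\cdots]$, the clause for capabilities gives:

\[ \forall m \in R^c(\beta^c_\bullet) : \forall l \in \{l_\star, l^a_\bullet\} : (I, H, C, R) \models^{l} m \qquad (7) \]

since the $\mathcal{M}$ used for validating $x.0$ must satisfy $\mathcal{M} \supseteq R^c(\beta^c_\bullet)$. Similarly, in the
fourth line of $U_\bullet$ we construct ambients and capabilities with the names just input;
this yields:

\[ R^n(\beta^n_\bullet) \subseteq H(l_\bullet) \quad \text{for each of the four distinguished labels } l_\bullet \text{ annotating } u[0],\ in\ u,\ out\ u \text{ and } open\ u \qquad (8) \]

since the $\mathcal{N}$ used for validating $u^{l^a_\bullet}[0]$ must satisfy $\mathcal{N} \supseteq R^n(\beta^n_\bullet)$ and similarly for
the three capabilities.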


Part 2. We now consider the second part of the proof. We shall say that a naming
environment $me$ is acceptable for the process $P$ whenever it defines all names and
variables in $P$, i.e. $fv(P) \cup fn(P) \subseteq dom(me)$, and whenever it only maps variables
and names to “acceptable” symbols:

\[ range(me) \subseteq \{\mu_\bullet, me_\star(n_1), \cdots, me_\star(n_m), \beta^c_\bullet, \beta^n_\bullet\} \]

In a similar way we define the acceptability of $me$ for a capability $M$ and a naming
$N$.

First we prove for all $me$ and $N$ that

\[ (I, H, C, R) \Vdash_{me} N : R^n(\beta^n_\bullet) \quad \text{provided } me \text{ is acceptable for } N \qquad (9) \]

where $(I, H, C, R)$ is as above. The proof is by inspection of the two cases for $N$.
The case $n$ is immediate given (4) and the acceptability of $me$. The case $u$ is trivial.

Next we prove for all $me$ and $M$ that

\[ (I, H, C, R) \Vdash_{me} \lfloor M \rfloor : R^c(\beta^c_\bullet) \quad \text{provided } me \text{ is acceptable for } M \qquad (10) \]

where again $(I, H, C, R)$ is as above. The proof is by structural induction in $\lfloor M \rfloor$.
In the cases $in^{l_\bullet} N$, $out^{l_\bullet} N$ and $open^{l_\bullet} N$ we take $\mathcal{N} = R^n(\beta^n_\bullet)$; the result is then
immediate from (9), (4) and (8). The cases $x$ and $\epsilon$ are trivial and the case $M_1.M_2$
is immediate from the induction hypothesis.

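Finally we prove for all $me$ and $P$ that

\[ (I, H, C, R) \models^{l}_{me} \lfloor P \rfloor \ \text{ for } l \in \{l_\star, l^a_\bullet\} \quad \text{whenever } me \text{ is acceptable for } P \qquad (11) \]

and where $(I, H, C, R)$ once more is as above. The proof is by structural induction in
$\lfloor P \rfloor$. The case $(\nu n^{\mu_\bullet})\lfloor P \rfloor$ follows from the induction hypothesis (since $me[n \mapsto \mu_\bullet]$
is acceptable for $\lfloor P \rfloor$). The cases $0$, $\lfloor P \rfloor \mid \lfloor P' \rfloor$ and $!\lfloor P \rfloor$ are immediate using the
induction hypothesis. In the case $N^{l^a_\bullet}[\lfloor P \rfloor]$ we choose $\mathcal{N} = R^n(\beta^n_\bullet)$; the result then
is a consequence of the induction hypothesis, (3), (9) and (8). In the case $\lfloor M \rfloor . \lfloor P \rfloor$
we take $\mathcal{M} = R^c(\beta^c_\bullet)$; the result then follows from the induction hypothesis, (10)
and (7). In the case for $\langle M \rangle^{l_\bullet}$ we take $\mathcal{M} = R^c(\beta^c_\bullet)$; the result then is a consequence
of (3), (10) and (6). In the case for $\langle\langle N \rangle\rangle^{l_\bullet}$ we take $\mathcal{N} = R^n(\beta^n_\bullet)$; the result then
follows from (3), (9) and (6). The case $(x^{\beta^c_\bullet}).\lfloor P \rfloor$ is a consequence of the induction
hypothesis (since $me[x \mapsto \beta^c_\bullet]$ is acceptable), (3) and (5). The case $((u^{\beta^n_\bullet})).\lfloor P \rfloor$
follows from the induction hypothesis (since $me[u \mapsto \beta^n_\bullet]$ is acceptable), (3) and
(5).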

Returning to the unaware process $U$ we observe that

\[ \{me_\star(n) \mid n \in fn(U)\} \subseteq \{me_\star(n_1), \cdots, me_\star(n_m)\} \]

and it then follows from (11) and Fact 4 that $me_\star$ can be restricted so as to be
acceptable for $U$; this gives $(I, H, C, R) \models^{l_\star}_{me_\star} \lfloor U \rfloor$ as desired. □

When $(\nu w^w)w^A[F]$ passes the test and $U$ is an unaware process we want to show
that no subambient of $U$ ever passes inside $w$. Informally, this will take the form
of assuming that $((\nu w^w)w^A[F]) \mid U \to^* S$ and guaranteeing that $S$ contains no
subambient $w^{l^a}[\cdots n^{l^{a'}}[\cdots] \cdots]$ where $n$ comes from $U$. To formalise this we shall
avail ourselves of Fact 2 that allows us to arrange the labelling to suit our needs.
Indeed, if $((\nu w^w)w^A[F]) \mid U \to^* S$ then $((\nu w^w)w^A[F]) \mid \lfloor U \rfloor \to^* S'$ for some $S'$ such
that $\lfloor S \rfloor = \lfloor S' \rfloor$.

Theorem 4. Suppose that $(\nu w^w)w^A[F]$ passes the test of Table 9 and that $U$ is
an unaware process; if $((\nu w^w)w^A[F]) \mid \lfloor U \rfloor \to^* S$ then $S$ contains no subterm
$n_1^{l^a_1}[\cdots n_2^{l^a_2}[\cdots] \cdots]$ where $n_1$ has stable name $w$ and $l^a_2$ is $l^a_\bullet$.

Proof. Letting $(I, H, C, R)$ be as in Table 9, it follows from Proposition 1 that
$(I, H, C, R) \models^{l_\star}_{me_\star} ((\nu w^w)w^A[F]) \mid \lfloor U \rfloor$. Then $(I, H, C, R) \models^{l_\star}_{me_\star} S$ follows from
Theorem 1. Suppose next, for the sake of contradiction, that $S$ does contain a subprocess
$n_1^{l^a_1}[\cdots n_2^{l^a_2}[\cdots] \cdots]$ where $n_1$ has stable name $w$ and $l^a_2$ is $l^a_\bullet$. Then it follows from
$(I, H, C, R) \models^{l_\star}_{me_\star} S$ that $l^a_\bullet \in I^+(l^a_1) \wedge w \in H(l^a_1)$, showing that the test could not
have been passed. □

Theorem 5. The test in Table 9 operates in cubic time.

Proof. Given a firewall of size $O(p)$ we may without loss of generality assume that
the set $\{n_1, \cdots, n_m\}$ of variables constructed in Table 9 does not contain any
variables not in the firewall; this ensures that also $((\nu w^w)w^A[F]) \mid U_\bullet$ will have size $O(p)$.
It follows from Theorem 3 that the least solution can be found in time $O(p + q^3)$
where, as before, $q$ is the number of elements of $\mathbf{Lab} \cup \mathbf{Bnd} \cup \mathbf{SNam}$. This dominates
the $O(q^3)$ operations needed to calculate the transitive closure $I^+$ of the relation $I$.
Hence the overall operation of Table 9 is $O(p^3)$ & $O(p + q^3)$ where $p$ is the size of the
firewall and $q = O(p)$ can be chosen “arbitrarily” small. □

In  summary,  we  have  succeeded  in  using  the  control  flow  analysis  to  devise  a  cubic 
time  algorithm  for  correctly  validating  that  a  proposed  firewall  is  indeed  protective; 
unlike  [18]  this  development  applies  to  the  full  ambient  calculus.  By  judicious  choice 
of  the  sets  of  labels,  binders  and  markers  the  test  can  be  performed  in  near-linear 
time  (although  at  the  cost  of  precision);  this  gives  confidence  in  exploring  our  current 
approach  on  fully  realistic  internet  programs. 

5  Conclusion


Static  analysis  provides  a  summary  of  the  behaviour  of  programs;  we  have  shown 
that  classical  control  flow  analysis  techniques  can  be  adapted  to  tackle  the  much 
more  dynamic  setting  of  the  ambient  calculus.  Our  flow  logic  approach  facilitates 
fully  automatic  validation  of  an  analysis  estimate  as  well  as  fully  automatic  compu¬ 
tation  of  the  best  estimate;  in  spirit  this  is  rather  similar  to  the  approaches  of  type 
inference.  Type  systems  have  already  been  extensively  used  to  study  the  properties 
of  web-based  languages  and  related  calculi  (e.g.  [1, 8]);  the  use  of  flow  logic  offers  a 
flexible  approach  to  adapting  the  vast  amount  of  more  “traditional”  approaches  to 
static  analysis  [17]. 

In this paper we developed a control flow analysis for the ambient calculus building
on recent developments for the $\pi$-calculus [2-4] and the ambient calculus [14, 18, 19,
22]. In fact a notion of cryptography is implicitly part of the ambient calculus as
presented here. As in the spi-calculus it is the restriction operator that is used to
model “secrets” that cannot be guessed by brute force attack. Thus a message $M$,
encrypted under the key $K$, is represented simply as the ambient $K[M]$ whereas an
attempt at decrypting such a message is represented by the ambient open $K$. If $K$
is a secret only known to the principals $P_1, \cdots, P_n$ the entire system is represented
as $(\nu K)(P_1 \mid \cdots \mid P_n)$ where each $P_i$ may contain $K[M]$ as well as open $K$.

More  importantly  we  demonstrated  how  a  careful  exploitation  of  the  detailed  op¬ 
eration  of  the  control  flow  analysis  allowed  us  to  construct  an  attacker  that  was 
as  hard  to  protect  against  as  any  other  attacker;  this  is  somewhat  reminiscent  of 
the  identification  of  hard  problems  in  a  given  complexity  class.  This  allowed  us  to 
predict  the  operation  of  the  firewall  in  conjunction  with  all  unaware  attackers  based 
on  its  operation  in  conjunction  with  the  hardest  attacker;  if  it  successfully  protects 
against  the  hard  attacker  it  will  also  protect  against  all  other  attackers  not  knowing 
the  required  passwords.  We  believe  this  to  be  typical  of  applications  where  software 
developed  by  subcontractors  is  validated  before  being  embedded  in  the  software 
system  under  construction.  To  make  this  practical  we  ensured  that  the  test  could 
be  performed  in  cubic  time. 

It  is  important  to  stress  that  we  circumvent  the  undecidability  of  dealing  with  all 
possible  execution  contexts,  in  particular  all  attackers,  by  coarsening  the  “level  of 
granularity”  of  our  observations  to  coincide  with  those  of  the  static  analysis.  We 
maintain  soundness  because  we  proved  the  static  analysis  to  be  sound  (but  of  course 
not  complete)  with  respect  to  the  dynamic  semantics. 

The  analysis  leading  to  the  hardest  attacker  can  be  compared  with  the  approaches  of 
the  “Dolev-Yao  tradition”  [10],  including  [5,23, 12].  Indeed,  the  analyses  performed 
in  these  studies  amount  to  an  “informal”  analysis  of  the  capabilities  of  the  attacker, 
leading  to  an  inductively  defined  behaviour.  Such  an  analysis  is  not  straightforward 
when  the  computational  capabilities  are  modified,  such  as  when  adding  mobility. 
The  advantage  of  our  approach  therefore  is  two-fold.  On  the  one  hand,  it  allows  us 
to  make  the  analysis  in  a  “formal”  manner,  because  we  can  prove  theorems  with 
respect  to  the  dynamic  semantics  (as  mitigated  by  the  analysis).  On  the  other  hand, 
the  techniques  used  to  implement  the  analysis  provide  insights  on  how  to  define  the 
capabilities  of  the  hardest  attacker. 

In  summary,  we  have  illustrated  a  novel  approach  to  the  validation  of  safety  and 
security  properties  of  software  systems.  We  are  hopeful  that  it  will  scale  up  to 
other  calculi  and  web-based  languages  with  explicit  cryptographic  primitives.  By 
considering  flow  logics  that  express  more  powerful  analyses  it  is  likely  that  one 
can  capture  attackers  in  a  more  precise  manner;  perhaps  one  can  show  that  the 
firewall protects against attackers knowing only some of the secret passwords or
even  against  attackers  knowing  all  the  secret  passwords  but  unable  to  use  them  in 
the  appropriate  manner  (as  might  be  the  case  for  an  agent  representing  a  “courier” 
service). 

GIVEN:       a bounded number of binary relations $R_1, \cdots, R_m$,
             and a bound on the size of each conditional constraint.

INPUT:       a set $\mathcal{C}$ of conditional constraints over the binary relations.

OUTPUT:      the least solution $(R_1, \cdots, R_m)$ such that $(R_1, \cdots, R_m) \models \mathcal{C}$.

INITIALISE:  for i := 1 to m do
               for each argument arg occurring in $\mathcal{C}$ do
                 $R_i$(arg) := $\emptyset$;

             for each set $\subseteq$ var occurring in $\mathcal{C}$ do
               INFL[set, var] := NIL;

             cno := 0; stack := NIL;

             for each $set_k \subseteq var_k \Rightarrow \cdots\ set_1 \subseteq var_1 \Rightarrow set_0 \subseteq var_0$ in $\mathcal{C}$ do
               cno := cno + 1;
               CT[cno] := ($set_k \subseteq var_k \Rightarrow \cdots\ set_1 \subseteq var_1 \Rightarrow set_0 \subseteq var_0$);
               if k = 0 then push(($set_0$, $var_0$)) else
                 for i := 1 to k do
                   INFL[$set_i$, $var_i$] := CONS(cno, INFL[$set_i$, $var_i$]);

ITERATE:     while stack $\neq$ NIL do
               let (set, var) = pop() in
               for cno in INFL[set, var] do
                 let ($set_k \subseteq var_k \Rightarrow \cdots\ set_1 \subseteq var_1 \Rightarrow set_0 \subseteq var_0$) = CT[cno] in
                 if $\bigwedge_{i=1}^{k} [\![set_i]\!] \subseteq [\![var_i]\!]$ then push(($set_0$, $var_0$)) else skip;

USING:       procedure push((set, var)) is
               if $[\![set]\!] \subseteq [\![var]\!]$ then skip else
                 $[\![var]\!]$ := $[\![var]\!] \cup [\![set]\!]$; stack := CONS((set, var), stack);

             function pop() is
               let (set, var) = HEAD(stack) in
               stack := TAIL(stack); return((set, var));

Table 10. Worklist solution of constraints.


22. H. Riis Nielson and F. Nielson. Shape analysis for mobile ambients. In Proceedings of
POPL'00, pages 142-154. ACM Press, 2000.

23. L. C. Paulson. Proving properties of security protocols by induction. In Proc. 10th
Computer Security Foundations Workshop, pages 70-83. IEEE, 1997.


A  Constraint  Solving 


We  now  consider  how  to  turn  a  set  C  of  conditional  constraints  of  size  |C|  into  the 
smallest  acceptable  analysis  estimate.  To  this  end  we  develop  a  worklist  algorithm 
(see e.g. [17, Chapter 6]). It operates on data structures $R_1, \ldots, R_m$ corresponding
to  a  constant  number  of  binary  relations.  Additionally,  it  makes  use  of  a  list  stack 
of  “bit  positions”  that  have  just  been  set  to  1  and  whose  consequences  remain  to 
be  explored,  a  table  CT  that  maps  constraint  numbers  to  the  constraint  in  question, 
and  a  table  INFL  that  for  each  “bit  position”  gives  a  list  of  constraint  numbers 
influenced  by  that  “bit  position”.  We  operate  on  lists  using  the  constructors  CONS 
and  NIL  and  the  destructors  HEAD  and  TAIL.  To  operate  on  stack  we  make  use  of 
the  function  push  for  setting  a  “bit  position”  to  1  and  for  placing  the  “bit  position” 
on  stack,  and  of  the  function  pop  for  returning  the  topmost  “bit  position”  on  stack 
and at the same time removing it from stack.

The worklist algorithm is displayed in Table 10. The initialisation of $R_1, \ldots, R_m$
amounts to setting $(R_1, \ldots, R_m)$ equal to the least element $(\bot, \ldots, \bot)$ of the
appropriate lattice of values. The initialisation next sets stack and all INFL[set, var] to
NIL and then computes the correct contents of the structures stack, CT and INFL by
iterating through all conditional constraints in C. The iteration phase amounts to
propagating “bit positions” as long as there are new “bit positions” recently set to
1 whose consequences need to be explored. We write ⟦set⟧ and ⟦var⟧ whenever
we interpret the entities set and var as denoting values and variables, i.e. whenever
they are used for comparisons or for updating the data structures. The algorithm
clearly terminates because no “bit position” is placed twice on stack thanks to the
test performed by the push operation.
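
The worklist scheme of Table 10 can be illustrated by a short Python sketch. It is a minimal sketch under stated assumptions, not the implementation analysed in this appendix: every atom `set ⊆ var` is simplified to a pair `(elem, var)` read as "elem is in var" (one "bit position"), Python lists and dictionaries stand in for CONS/NIL and the tables CT and INFL, and the names `solve`, `rel`, `infl` and `ct` are hypothetical.

```python
from collections import defaultdict


def solve(constraints):
    """Least solution of Horn-style conditional constraints.

    Each constraint is a pair (conditions, conclusion). Every atom is a
    pair (elem, var), read as "elem is in var" (an assumed simplification
    of the atoms "set ⊆ var" used in Table 10).
    """
    rel = defaultdict(set)    # var -> elements; every relation starts empty
    infl = defaultdict(list)  # atom -> numbers of the constraints it influences
    ct = []                   # constraint table: number -> constraint
    stack = []                # atoms just set whose consequences are pending

    def push(atom):
        elem, var = atom
        if elem not in rel[var]:  # this test ensures no atom is stacked twice
            rel[var].add(elem)
            stack.append(atom)

    # INITIALISE: number the constraints and record what influences what.
    for conditions, conclusion in constraints:
        cno = len(ct)
        ct.append((conditions, conclusion))
        if not conditions:        # k = 0: an unconditional constraint
            push(conclusion)
        for atom in conditions:
            infl[atom].append(cno)

    # ITERATE: propagate until no freshly set atom remains.
    while stack:
        atom = stack.pop()
        for cno in infl[atom]:
            conditions, conclusion = ct[cno]
            if all(elem in rel[var] for elem, var in conditions):
                push(conclusion)

    return {var: set(elems) for var, elems in rel.items()}
```

Calling `solve` on a list such as `[([], ("a", "x")), ([("a", "x")], ("a", "y"))]` returns a dictionary from variables to the elements forced into them; a conclusion whose conditions are never all satisfied is simply never added.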

We  can  now  state  the  correctness  of  the  worklist  algorithm: 

Lemma 5. Table 10 computes $\sqcap\{(R_1, \ldots, R_m) \mid (R_1, \ldots, R_m) \models C\}$.

Proof. Write $(\bot, \ldots, \bot)$ for the least element and write

$(R_1', \ldots, R_m') = \sqcap\{(R_1, \ldots, R_m) \mid (R_1, \ldots, R_m) \models C\}$

for the least solution (guaranteed by the obvious extension of Theorem 2). It follows
from the “Horn clause” format of the conditional constraints that the invariant

$(\bot, \ldots, \bot) \sqsubseteq (R_1, \ldots, R_m) \sqsubseteq (R_1', \ldots, R_m')$

remains fulfilled throughout the iteration. Next write $(R_1, \ldots, R_m)$ for the resulting
value. It follows from the above that

$(R_1, \ldots, R_m) \sqsubseteq (R_1', \ldots, R_m')$

and since it is clear that all constraints in C are fulfilled upon termination it follows
that

$(R_1, \ldots, R_m) \models C$.

Using the definition of $(R_1', \ldots, R_m')$ we then have

$(R_1, \ldots, R_m) = \sqcap\{(R_1, \ldots, R_m) \mid (R_1, \ldots, R_m) \models C\}$

showing that the algorithm computes the least solution. □

To  state  the  complexity  of  the  worklist  algorithm  we  assume  that  there  is  a  constant 
bound  on  the  size  of  the  conditional  constraints  occurring  in  C : 

Lemma 6 (Special case of [11]). Table 10 operates in time $O(|C|)$.

Proof. The initialisation of $R_1, \ldots, R_m$ takes time $O(|C|)$. Setting INFL[set, var] to
NIL also takes time $O(|C|)$. The remaining initialisation performs $O(1)$ steps for
each of the $O(|C|)$ conditional constraints (recalling that $k = O(1)$, i.e. the length of
conditions is bounded by a constant). For the iteration at most $O(|C|)$ “bit positions”
are placed on the stack thanks to the test performed by the push operation and
to the fact that a “bit position” set to 1 is never reset to 0 given the “Horn clause”
format of the conditional constraints. Ignoring the inspection of INFL[set, var] we
therefore use time $O(|C|)$. The inspection of INFL[set, var] can be amortised over
all iterations since each constraint is only considered $O(1)$ times (again because
$k = O(1)$) and thus takes overall time $O(|C|)$. □

Abstract

We  deal  with  the  problem  of  a  center  sending  a  message  to  a  group  of  users  such  that  some  subset 
of  the  users  is  considered  revoked  and  should  not  be  able  to  obtain  the  content  of  the  message.  We 
concentrate  on  the  stateless  receiver  case,  where  the  users  do  not  (necessarily)  update  their  state  from 
session  to  session.  We  present  a  framework  called  the  Subset-Cover  framework,  which  abstracts  a  variety 
of revocation schemes including some previously known ones. We provide sufficient conditions that
guarantee the security of a revocation algorithm in this class.

We  describe  two  explicit  Subset-Cover  revocation  algorithms;  these  algorithms  are  very  flexible  and 
work for any number of revoked users. The schemes require storage at the receiver of $\log N$ and $\frac{1}{2}\log^2 N$
keys respectively ($N$ is the total number of users), and in order to revoke $r$ users the required message
lengths are of $r \log N$ and $2r$ keys respectively. We also provide a general traitor tracing mechanism that
can  be  integrated  with  any  Subset-Cover  revocation  scheme  that  satisfies  a  “bifurcation  property”.  This 
mechanism  does  not  need  an  a  priori  bound  on  the  number  of  traitors  and  does  not  expand  the  message 
length  by  much  compared  to  the  revocation  of  the  same  set  of  traitors. 

The main improvements of these methods over previously suggested methods, when adapted to
the stateless scenario, are: (1) reducing the message length to $O(r)$ regardless of the coalition size
while maintaining a single decryption at the user’s end, and (2) providing a seamless integration between
the revocation and tracing so that the tracing mechanism does not require any change to the revocation
algorithm.


1  Introduction 

The  problem  of  a  Center  transmitting  data  to  a  large  group  of  receivers  so  that  only  a  predefined  subset 
is  able  to  decrypt  the  data  is  at  the  heart  of  a  growing  number  of  applications.  Among  them  are  pay-TV 
applications,  multicast  communication,  secure  distribution  of  copyright-protected  material  (e.g.  music)  and 
audio  streaming.  The  area  of  Broadcast  Encryption  deals  with  methods  to  efficiently  broadcast  informa¬ 
tion  to  a  dynamically  changing  group  of  users  who  are  allowed  to  receive  the  data.  It  is  often  convenient  to 
think  of  it  as  a  Revocation  Scheme,  which  addresses  the  case  where  some  subset  of  the  users  are  excluded 
from  receiving  the  information.  In  such  scenarios  it  is  also  desirable  to  have  a  Tracing  Mechanism,  which 
enables  the  efficient  tracing  of  leakage,  specifically,  the  source  of  keys  used  by  illegal  devices,  such  as  pirate 
decoders  or  clones. 

One  special  case  is  when  the  receivers  are  stateless.  In  such  a  scenario,  a  receiver  is  not  capable  of 
recording the past history of transmissions and changing its state accordingly. Instead, its operation must be
based on the current transmission and its initial configuration. Stateless receivers are important for the case
where the receiver is a device that is not constantly on-line, such as a media player (e.g. a CD or DVD player
where the “transmission” is the current disc), a satellite receiver (GPS) and perhaps in multicast applications.

This paper introduces very efficient revocation schemes which are specifically suitable for stateless
receivers.  Our  approach  is  quite  general.  We  define  a  framework  of  such  algorithms,  called  Subset-Cover 
algorithms,  and  provide  a  sufficient  condition  for  an  algorithm  in  this  family  to  be  secure.  Furthermore,  we 
suggest  two  particular  implementations  for  schemes  in  this  family;  the  performance  of  the  second  method 
is  substantially  better  than  any  previously  known  algorithm  for  this  problem  (see  Section  1.1).  We  also 
provide  a  general  property  (‘bifurcation’)  of  revocation  algorithms  in  our  framework  that  allows  efficient 
tracing  methods,  without  modifying  the  underlying  revocation  scheme. 

Copyright  Protection 

An important application that motivates the study of revocation and tracing mechanisms is Copyright
Protection. The distribution of copyright protected content for (possibly) disconnected operations involves
encryption of the content on a medium. The medium (such as a CD, DVD or flash memory card) typically
contains in its header the encryption of the key $K$ which encrypts the content following the header. Compliant
devices, or receivers, store appropriate decryption keys that can be used to decrypt the header and in turn
decrypt the content. A copyright protection mechanism defines the algorithm which assigns keys to devices
and encrypts the content.

An essential requirement of a copyright protection mechanism is the ability to revoke, during the
lifetime of the system, devices that are being used illegally. It is expected that some devices will be
compromised, either via reverse engineering or due to sloppy manufacturing of the devices. As a result, keys of
a number of compromised devices can then be cloned to form a decrypting unit. This copyright protection
violation  can  be  combated  by  revoking  the  keys  of  the  compromised  devices.  Note  that  devices  are  stateless 
as  they  are  assumed  to  have  no  capability  of  dynamically  storing  any  information  (other  than  the  original 
information  that  is  stored  at  the  time  of  manufacturing)  and  also  since  they  are  typically  not  connected  to 
the  world  (except  via  the  media).  Hence,  it  is  the  responsibility  of  the  media  to  carry  the  current  state  of  the 
system  at  the  time  of  recording  in  terms  of  revoked  devices. 

It is also highly desirable that the revocation algorithm be coupled with a traitor tracing mechanism.
Specifically, a well-designed copyright protection mechanism should be able to combat piracy in the form
of illegal boxes or clone decoding programs. Such decoders typically contain the identities of a number of
devices that are cooperating; furthermore, they are hard to disassemble¹. The tracing mechanism should
therefore treat the illicit decoder as a black box and simply examine its input/output relationship. A
combination of a revocation and a tracing mechanism provides a powerful tool for combating piracy: finding the
identities of compromised devices, revoking them and rendering the illegal boxes useless.

Caveat. The goal of a copyright protection mechanism is to create a legal channel of distribution of
content and to disallow its abuse. As a consequence, an illegal distribution will require the establishment
of alternative channels and should not be able to piggyback on the legitimate channel². Such alternative
channels should be combated using other means and are outside the scope of the techniques developed in
this paper, though techniques such as revocation may be a useful deterrent against rogue users.

¹ For instance, the software clone known as DeCSS, that cracked the DVD Video “encryption”, is shielded by a tamper-resistant
software tool which makes it very hard to reverse engineer its code and know its details such as receiver identities or its decoding
strategy.

² For instance, in the case of cable TV the pirates should be forced to create their own cable network.




1.1  Related  Work 


Broadcast  Encryption.  The  area  of  Broadcast  Encryption  was  first  formally  studied  (and  coined)  by 
Fiat  and  Naor  in  [23]  and  has  received  much  attention  since  then.  To  the  best  of  our  knowledge  the  scenario 
of  stateless  receivers  has  not  been  considered  explicitly  in  the  past  in  a  scientific  paper,  but  in  principle  any 
scheme  that  works  for  the  connected  mode  may  be  converted  to  a  scheme  for  stateless  receivers.  (Such 
conversion  may  require  including  with  any  transmission  the  entire  ‘history’  of  revocation  events.)  When 
discussing  previously  proposed  schemes  we  will  consider  their  performance  when  adapted  to  the  stateless 
receiver  scenario. 

To survey previous results we should fix our notation. Let $N$ be the total number of users in the system
and let $r$ be the size of the revoked set $\mathcal{R}$. Another important parameter that is often considered is $t$, the upper
bound on the size of the coalition an adversary can assemble. The algorithms in this paper do not require
such a bound and we can think of $t = r$; on the other hand some previously proposed schemes depend on
$t$ but are independent of $r$. The Broadcast Encryption method of [23] is one such scheme which allows the
removal of any number of users as long as at most $t$ of them collude. There the message length is $O(t \log^2 t)$,
a user must store a number of keys that is logarithmic in $t$ and the amount of work required by the user is
$O(r/t)$ decryptions.

The logical-tree-hierarchy (LKH) scheme, suggested independently by Wallner et al. [42] and Wong
et al. [43], is designed for the connected mode for multicast re-keying applications. It revokes a single user
at a time, and updates the keys of all remaining users. If used in our scenario, it requires a transmission of
$2r \log N$ keys to revoke $r$ users, each user should store $\log N$ keys and the amount of work each user should
do is $r \log N$ encryptions (the expected number is $O(r)$ for an average user). These bounds are somewhat
improved in [12, 13, 32], but unless the storage at the user is extremely high they still require a transmission
of length $\Omega(r \log N)$. The key assignment of this scheme and the key assignment of our first method are
similar; see further discussion on comparing the two methods in Section 3.1.

Luby and Staddon [31] considered the information theoretic setting and devised bounds for any
revocation algorithms under this setting. Their “Or Protocol” fits our Subset-Cover framework. In fact, as we
show in Section 3.3, our second algorithm (the Subset Difference method), which is not information
theoretic, beats their lower bound (Theorem 12 in [31]). Garay, Staddon and Wool [27] introduced the notion
of long-lived broadcast encryption. In this scenario, keys of compromised decoders are no longer used for
encryptions. The question they address is how to adapt the broadcast encryption scheme so as to maintain
the security of the system for the good users.

The method of Kumar et al. [30] enables one-time revocation of up to $r$ users with message lengths of
$O(r \log N)$ and $O(r^2)$.

CPRM  [18]  is  one  of  the  methods  that  explicitly  considers  the  stateless  scenario.  Our  Subset  Difference 
method  outperforms  CPRM  by  a  factor  of  about  25  in  the  number  of  revocations  it  can  handle  when  all  the 
other  parameters  are  fixed;  Section  4.5  contains  a  detailed  description  and  comparison. 

Tracing Mechanisms. The notion of a tracing system was first introduced by Chor, Fiat and Naor in [16],
and was later refined to the Threshold Traitor model in [34], [17]. Its goal is to distribute decryption keys to
the users so as to allow the detection of at least one key that is used in a pirate box or clone using keys of at
most $t$ users. Black-box tracing assumes that only the outcome of the decoding box can be examined. [34],
[17] provide combinatorial and probabilistic constructions that guarantee tracing with high probability. To
trace $t$ traitors, they require each user to store $O(t \log N)$ keys and to perform a single decryption operation.
The message length is $4t$. The public key tracing scheme of Boneh and Franklin [7] provides a
number-theoretic deterministic method for tracing. Note that in all of the above methods $t$ is an a priori bound.

Preventing Leakage of Keys. The problem of preventing illegal leakage of keys has been attacked by a
number of quite different approaches. The legal approach, suggested by Pfitzmann [39], requires a method
that not only traces the leaker but also yields a proof for the liability of the traitor (the user whose keys are
used by an illegal decoder). Hence, leakage can be fought via legal channels by presenting this proof to
a third party. The self enforcement approach, suggested by Dwork, Lotspiech and Naor [20], aims at
deterring users from revealing their personal keys. The idea is to provide a user with personal keys that
contain some sensitive information about the user which the user will be reluctant to disclose. The
trace-and-revoke approach is to design a method that can trace the identity of the user whose key was leaked; in
turn, this user’s key is revoked from the system for future uses. The results in this paper fall into the latter
category, albeit in a slightly relaxed manner. Although our methods assure that leaked keys will become
useless in future transmissions, they may not reveal the actual identities of all leaking keys, thus somewhat
lacking self-enforcement.

Content  Tracing:  In  addition  to  tracing  leakers  who  give  away  their  private  keys  there  are  methods  that 
attempt  to  detect  illegal  users  who  redistribute  the  content  after  it  is  decoded.  This  requires  the  assumption 
that  good  watermarking  techniques  with  the  following  properties  are  available:  it  is  possible  to  insert  one 
of  several  types  of  watermarks  into  the  content  so  that  the  adversary  cannot  create  a  “clean”  version  with 
no  watermarks  (or  a  watermark  it  did  not  receive).  Typically,  content  is  divided  into  segments  that  are 
watermarked  separately.  This  setting  with  protection  against  collusions  was  first  investigated  by  Boneh  and 
Shaw  [9].  A  related  setting  with  slightly  stronger  assumptions  on  the  underlying  watermarking  technique 
was  investigated  in  [24, 5, 41].  By  introducing  the  time  dimension,  Fiat  and  Tassa  [24]  propose  the  dynamic 
tracing  scenario  in  which  the  watermarking  of  a  segment  depends  on  feedback  from  the  previous  segment 
and which detects all traitors. Their algorithm was improved by Berkman, Parnas and Sgall [5], and a scheme
which requires no real-time computation/feedback for this model was given by Safavi-Naini and Wang
[41].  Content  tracing  is  relevant  to  our  scenario  in  that  any  content  tracing  mechanism  can  be  combined 
with  a  key-revocation  method  to  ensure  that  the  traced  users  are  indeed  revoked  and  do  not  receive  new 
content  in  the  future.  Moreover,  the  tracing  methods  of  [24]  are  related  to  the  tracing  algorithm  of  Section 
5.2. 

Integration  of  tracing  and  revocation.  Broadcast  encryption  can  be  combined  with  tracing  schemes  to 
yield trace-and-revoke schemes. While Gafni et al. [26] only consider combinatorial constructions, the
schemes in Naor and Pinkas [35] are more general. The previously best known trace-and-revoke algorithm
of [35] can tolerate a coalition of at most $t$ users. It requires storing $O(t)$ keys at each user and performing
$O(r)$ decryptions; the message length is $r$ keys, however these keys are elements in a group where the
Decisional Diffie-Hellman problem is difficult and therefore these keys may be longer than symmetric keys.
The tracing model of [35] is not a “pure” black-box model. Anzai et al. [2] employ a similar method for
revocation, but without tracing capabilities. Our results improve upon this algorithm both in the work that
must be performed at the user and in the lengths of the keys transmitted in the message.

1.2 Summary of Results

In  this  paper  we  define  a  generic  framework  encapsulating  several  previously  proposed  revocation  methods, 
called  Subset-Cover  algorithms.  These  algorithms  are  based  on  the  principle  of  covering  all  non-revoked 
users by disjoint subsets from a predefined collection. We define the security of a revocation scheme and
provide a sufficient condition (key-indistinguishability) for a revocation algorithm in the Subset-Cover
Framework to be secure. An important consequence of this framework is the separation between long-lived keys
and  short-term  keys.  The  framework  can  be  easily  extended  to  the  public-key  scenario. 

Method              Message Length      Storage @ Receiver        Processing Time    No. Decryptions
Complete Subtree    $r \log(N/r)$       $\log N$                  $O(\log \log N)$   1
Subset Difference   $2r - 1$            $\frac{1}{2}\log^2 N$     $O(\log N)$        1

Figure 1: Performance Tradeoff for the Complete Subtree method and the Subset Difference method

We then provide two new instantiations of revocation schemes in the Subset-Cover Framework, with a
different performance tradeoff (summarized in Figure 1³). Both instantiations are tree-based, namely the
subsets are derived from a virtual tree structure imposed on all devices in the system. The first requires a
message length of $r \log N$ and storage of $\log N$ keys at the receiver and constitutes a moderate improvement
over previously proposed schemes; the second exhibits a substantial improvement: it requires a message
length of $2r - 1$ (in the worst case, or $1.38r$ in the average case) and storage of $\frac{1}{2}\log^2 N$ keys at the receiver.
$N$ is the total number of devices, and $r$ is the number of revoked devices. Furthermore, these algorithms are
r-flexible, namely they do not assume an upper bound on the number of revoked receivers.
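
To give a feeling for these bounds, the following Python sketch evaluates the expressions quoted in this section for concrete values of $N$ and $r$. It is only an illustration: the function names are hypothetical, constant factors hidden by the asymptotic notation are ignored, and logarithms are assumed to be base 2 (natural for tree-based schemes, but not stated explicitly above).

```python
import math


def header_keys(n_users, n_revoked):
    """Header sizes, in keys, using the expressions quoted in the text.

    Assumption: all logarithms are base 2.
    """
    log_n = math.log2(n_users)
    return {
        "lkh_stateless": 2 * n_revoked * log_n,         # 2r log N
        "complete_subtree": n_revoked * log_n,          # r log N
        "subset_difference_worst": 2 * n_revoked - 1,   # 2r - 1
        "subset_difference_avg": 1.38 * n_revoked,      # about 1.38r
    }


def storage_keys(n_users):
    """Keys stored at each receiver, under the same base-2 assumption."""
    log_n = math.log2(n_users)
    return {
        "complete_subtree": log_n,                      # log N
        "subset_difference": 0.5 * log_n ** 2,          # (1/2) log^2 N
    }
```

For example, with $N = 2^{20}$ devices and $r = 1000$ revocations, `header_keys(2**20, 1000)` gives 20000 keys for the first method against 1999 keys in the worst case for the second, while `storage_keys(2**20)` gives 20 and 200 keys respectively, which makes the tradeoff between message length and receiver storage concrete.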

Thirdly, we present a tracing mechanism that works in tandem with a Subset-Cover revocation scheme.
We identify the bifurcation property for a Subset-Cover scheme. Our two constructions of revocation
schemes possess this property. We show that every scheme that satisfies the bifurcation property can be
combined with the tracing mechanism to yield a trace-and-revoke scheme. The integration of the two
mechanisms is seamless in the sense that no change is required for any one of them. Moreover, no a priori bound
on the number of traitors is needed for our tracing scheme. In order to trace $t$ illegal users, the first
revocation method requires a message length of $t \log N$, and the second revocation method requires a message
length of $5t$.

Main Contributions: the main improvements that our methods achieve over previously suggested
methods, when adapted to the stateless scenario, are:

•  Reducing  the  message  length  to  linear  in  r  regardless  of  the  coalition  size,  while  maintaining  a  single 
decryption  at  the  user’s  end.  This  applies  also  to  the  case  where  public  keys  are  used,  without  a 
substantial  length  increase. 

•  The  seamless  integration  between  revocation  and  tracing:  the  tracing  mechanism  does  not  require 
any  change  of  the  revocation  algorithm  and  no  a  priori  bound  on  the  number  of  traitors,  even  when  all 
traitors  cooperate  among  themselves. 

•  The  rigorous  treatment  of  the  security  of  such  schemes,  identifying  the  effect  of  parameter  choice  on 
the  overall  security  of  the  scheme. 

Organization  of  the  paper.  Section  2  describes  the  framework  for  Subset-Cover  algorithms.  Section  3 
describes  two  specific  implementations  of  such  algorithms.  Section  4  presents  extensions,  implementation 
issues,  public-key  methods,  application  to  multicasting  as  well  as  casting  the  recently  proposed  CPRM 
method (for DVD-audio and SD cards) in the Subset-Cover Framework. Section 5 provides the traitor
tracing extensions to Subset-Cover revocation algorithms and their seamless integration. In Section 6 we
define the “key-indistinguishability” property and provide the main theorem characterizing the security of
revocation algorithms in the Subset-Cover framework.

³ Note that the comparison in the processing time between the two methods treats an application of a pseudo-random generator
and a lookup operation as having the same cost, even though they might be quite different. More explicitly, the processing of both
methods consists of $O(\log \log N)$ lookups; in addition, the Subset Difference method requires at most $\log N$ applications of a
pseudo-random generator.

2 The Subset-Cover Revocation Framework


2.1  Preliminaries  -  Problem  Definition 

Let $\mathcal{N}$ be the set of all users, $|\mathcal{N}| = N$, and $\mathcal{R} \subset \mathcal{N}$ be a group of $|\mathcal{R}| = r$ users whose decryption privileges
should be revoked. The goal of a revocation algorithm is to allow a center to transmit a message $M$ to all
users such that any user $u \in \mathcal{N} \setminus \mathcal{R}$ can decrypt the message correctly, while even a coalition consisting of
all members of $\mathcal{R}$ cannot decrypt it. The exact definition of the latter is provided in Section 6.

A system consists of three parts: (1) An initiation scheme, which is a method for assigning the receivers
secret information that will allow them to decrypt. (2) The broadcast algorithm - given a message $M$ and
the set $\mathcal{R}$ of users that should be revoked, it outputs a ciphertext message $M'$ that is broadcast to all receivers.
(3) A decryption algorithm - a (non-revoked) user that receives ciphertext $M'$ should, using its secret
information, produce the original message $M$. Since the receivers are stateless, the output of the decryption should
be based on the current message and the secret information only.

2.2  The  Framework 

We present a framework for algorithms which we call Subset-Cover. In this framework an algorithm defines
a collection of subsets $S_1, \ldots, S_w$, $S_j \subseteq \mathcal{N}$. Each subset $S_j$ is assigned (perhaps implicitly) a long-lived
key $L_j$; each member $u$ of $S_j$ should be able to deduce $L_j$ from its secret information. Given a revoked set
$\mathcal{R}$, the remaining users $\mathcal{N} \setminus \mathcal{R}$ are partitioned into disjoint subsets $S_{i_1}, \ldots, S_{i_m}$ so that

$$\mathcal{N} \setminus \mathcal{R} = \bigcup_{j=1}^{m} S_{i_j}$$

and a session key $K$ is encrypted $m$ times with $L_{i_1}, \ldots, L_{i_m}$.
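
The partition condition can be stated operationally. The following Python sketch checks whether a proposed list of subsets is a valid cover in the above sense, i.e. whether the subsets are pairwise disjoint and their union is exactly $\mathcal{N} \setminus \mathcal{R}$. The function name `is_valid_cover` and the representation of users as hashable values are assumptions made for illustration only.

```python
def is_valid_cover(all_users, revoked, cover):
    """Return True iff the subsets in `cover` are pairwise disjoint and
    their union equals the set of non-revoked users."""
    target = set(all_users) - set(revoked)
    seen = set()
    for subset in cover:
        subset = set(subset)
        if seen & subset:      # two subsets overlap: not a partition
            return False
        seen |= subset
    return seen == target      # nothing missing and no revoked user included
```

For instance, with users 0..7 and revoked set {2, 5}, the cover [{0, 1}, {3}, {4}, {6, 7}] is accepted, whereas any cover that contains user 2, omits user 3, or lists a user twice is rejected.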

Specifically,  an  algorithm  in  the  framework  uses  two  encryption  schemes: 

• A method $F_K : \{0,1\}^* \mapsto \{0,1\}^*$ to encrypt the message itself. The key $K$ used will be chosen fresh
for each message $M$ - a session key - as a random bit string. $F_K$ should be a fast method and should
not expand the plaintext. The simplest implementation is to XOR the message $M$ with a stream cipher
generated by $K$.

• A method to deliver the session key to the receivers, for which we will employ an encryption scheme.
The keys $L$ here are long-lived. The simplest implementation is to make $E_L : \{0,1\}^* \mapsto \{0,1\}^*$ a
block cipher.

An exact discussion of security requirements of these primitives is given in Section 6. Some suggestions for
the implementation of $F_K$ and $E_L$ are given in Section 4.1. The algorithm consists of three components:

Scheme Initiation: Every receiver $u$ is assigned private information $I_u$. For all $1 \le i \le w$ such that $u \in S_i$,
$I_u$ allows $u$ to deduce the key $L_i$ corresponding to the set $S_i$. Note that the keys $L_i$ can be chosen either (i)
uniformly at random and independently from each other (which we call the information-theoretic case) or
(ii) as a function of other (secret) information (which we call the computational case), and thus may not be
independent of each other.

The  Broadcast  algorithm  at  the  Center: 

1. Choose a session encryption key $K$.

2. Given a set $\mathcal{R}$ of revoked receivers, the center finds a partition of the users in $\mathcal{N} \setminus \mathcal{R}$ into disjoint subsets $S_{i_1}, \ldots, S_{i_m}$. Let $L_{i_1}, \ldots, L_{i_m}$ be the keys associated with the above subsets.

3. The center encrypts $K$ with keys $L_{i_1}, \ldots, L_{i_m}$ and sends the ciphertext

$$\langle [i_1, i_2, \ldots, i_m, E_{L_{i_1}}(K), E_{L_{i_2}}(K), \ldots, E_{L_{i_m}}(K)], F_K(M) \rangle$$

The portion in square brackets preceding $F_K(M)$ is called the header and $F_K(M)$ is called the body.

The Decryption step at the receiver $u$, upon receiving a broadcast message

$$\langle [i_1, i_2, \ldots, i_m, C_1, C_2, \ldots, C_m], M' \rangle$$

1. Find $i_j$ such that $u \in S_{i_j}$ (in case $u \in \mathcal{R}$ the result is null).

2. Extract the corresponding key $L_{i_j}$ from $I_u$.

3. Compute $D_{L_{i_j}}(C_j)$ to obtain $K$.

4. Compute $D_K(M')$ to obtain and output $M$.
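The broadcast and decryption steps above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and both $F_K$ and $E_L$ are replaced by the same toy XOR cipher whose keystream comes from SHA-256 in counter mode. That stand-in is not a vetted cipher and reuses the keystream of a long-lived key, so it must not be used in practice.

```python
import hashlib
import os


def keystream(key, n):
    """Toy keystream: SHA-256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def xor_cipher(key, data):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def broadcast(message, cover, subset_keys):
    """cover: subset indices i_1..i_m; subset_keys: dict mapping i to L_i."""
    session_key = os.urandom(16)                       # step 1: choose K
    header = [(i, xor_cipher(subset_keys[i], session_key))
              for i in cover]                          # step 3: E_{L_i}(K)
    body = xor_cipher(session_key, message)            # F_K(M)
    return header, body


def receive(header, body, my_keys):
    """my_keys: dict mapping i to L_i for the subsets this receiver is in."""
    for i, encrypted_session_key in header:
        if i in my_keys:                               # steps 1 and 2
            session_key = xor_cipher(my_keys[i], encrypted_session_key)
            return xor_cipher(session_key, body)       # steps 3 and 4
    return None                                        # receiver is revoked
```

A receiver that holds none of the keys named in the header gets `None`, which corresponds to the null result for $u \in \mathcal{R}$.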

A particular implementation of such a scheme is specified by (1) the collection of subsets $S_1, \ldots, S_w$, (2) the key assignment to each subset in the collection, (3) a method to cover the non-revoked receivers $\mathcal{N} \setminus \mathcal{R}$ by disjoint subsets from this collection, and (4) a method that allows each user $u$ to find its cover $S$ and compute its key $L_S$ from $I_u$. The algorithm is evaluated based upon three parameters:

i. Message Length - the length of the header that is attached to $F_K(M)$, which is proportional to $m$, the number of sets in the partition covering $\mathcal{N} \setminus \mathcal{R}$.

ii. Storage size at the receiver - how much private information (typically, keys) a receiver needs to store. For instance, $I_u$ could simply consist of all the keys $L_i$ such that $u \in S_i$, or, if the key assignment is more sophisticated, it should allow the computation of all such keys.

iii.  Message  processing  time  at  receiver.  We  often  distinguish  between  decryption  and  other  types  of 
operations. 

It is important to characterize the dependence of the above three parameters in both $N$ and $r$. Specifically, we say that a revocation scheme is flexible with respect to $r$ if the storage at the receiver is not a function of $r$. Note that the efficiency of setting up the scheme and computing the partition (given $\mathcal{R}$) is not taken into account in the algorithm's analysis. However, for all schemes presented in this paper the computational requirements of the sender are rather modest: finding the partition takes time linear in $|\mathcal{R}| \log N$ and the encryption is proportional to the number of subsets in the partition.

2.3  Security  of  the  Framework:  Key-Indistinguishability 

Section  6  discusses  in  detail  the  security  of  an  algorithm  in  the  Subset-Cover  framework.  Intuitively,  we 
identify  a  critical  property  that  is  required  from  the  key-assignment  method  in  order  to  provide  a  secure 
Subset-Cover algorithm. We say that a subset-cover algorithm satisfies the "key-indistinguishability" property if for every subset $S_i$ its key $L_i$ is indistinguishable from a random key given all the information of all users that are not in $S_i$. Note that any scheme in which the keys to all subsets are chosen independently satisfies this property. We then proceed to show that any subset-cover algorithm that satisfies the key-indistinguishability property provides a secure encryption of the message (the overall encryption security is expressed as a function of the security provided by $E$ and $F$).

3 Two Subset-Cover Revocation Algorithms

We  now  describe  two  instantiations  of  revocation  schemes  in  the  Subset-Cover  framework  with  a  different 
performance  tradeoff,  as  summarized  in  table  1.2.  Each  is  defined  over  a  different  collection  of  subsets. 
Both  schemes  are  r-flexible,  namely  they  work  with  any  number  of  revocations.  In  the  first  scheme,  the 
key  assignment  is  information-theoretic  and  in  the  other  scheme  the  key  assignment  is  computational.  The 
first  method  is  natural;  the  second  method  is  more  involved,  and  exhibits  a  substantial  improvement  over 
previous  methods. 

In both schemes the subsets and the partitions are obtained by imagining the receivers as the leaves in a rooted full binary tree with $N$ leaves (assume that $N$ is a power of 2). Such a tree contains $2N - 1$ nodes (leaves plus internal nodes) and for any $1 \le i \le 2N - 1$ we assume that $v_i$ is a node in the tree. The systems differ in the collections of subsets they consider.

3.1  The  Complete  Subtree  Method 

The collection of subsets $S_1, \ldots, S_w$ in our first scheme corresponds to all complete subtrees in the full binary tree with $N$ leaves. For any node $v_i$ in the full binary tree (either an internal node or a leaf, $2N - 1$ altogether) let the subset $S_i$ be the collection of receivers $u$ that correspond to the leaves of the subtree rooted at node $v_i$. In other words, $u \in S_i$ iff $v_i$ is an ancestor of $u$. The key assignment method is simple: assign an independent and random key $L_i$ to every node $v_i$ in the complete tree. Provide every receiver $u$ with the $\log N + 1$ keys associated with the nodes along the path from the root to leaf $u$.
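A minimal sketch of this key assignment, assuming heap-style node numbering that the paper does not prescribe: the root is node 1, the children of node $v$ are $2v$ and $2v+1$, and receiver $u \in \{0, \ldots, N-1\}$ sits at leaf $N + u$. The function names are illustrative.

```python
import os


def path_to_root(leaf):
    """Nodes on the path from a leaf up to the root (node 1), leaf first."""
    nodes = []
    v = leaf
    while v >= 1:
        nodes.append(v)
        v //= 2
    return nodes


def assign_keys(n_receivers, key_bytes=16):
    """Give every node an independent random key L_i and every receiver
    the keys on its root-to-leaf path (n_receivers must be a power of 2)."""
    node_keys = {v: os.urandom(key_bytes) for v in range(1, 2 * n_receivers)}
    receiver_keys = {}
    for u in range(n_receivers):
        leaf = n_receivers + u
        receiver_keys[u] = {v: node_keys[v] for v in path_to_root(leaf)}
    return node_keys, receiver_keys
```

With $N = 8$ the tree has 15 nodes and each receiver holds $\log N + 1 = 4$ keys, one of which is the root key shared by everyone.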

For a given set $\mathcal{R}$ of revoked receivers, let $u_1, \ldots, u_r$ be the leaves corresponding to the elements in $\mathcal{R}$. The method to partition $\mathcal{N} \setminus \mathcal{R}$ into disjoint subsets is as follows. Consider the (directed) Steiner Tree $ST(\mathcal{R})$ defined by the set $\mathcal{R}$ of vertices and the root, i.e. the minimal subtree of the full binary tree that connects all the leaves in $\mathcal{R}$. $ST(\mathcal{R})$ is unique. Let $S_{i_1}, \ldots, S_{i_m}$ be all the subtrees of the original tree that "hang" off $ST(\mathcal{R})$, that is, all subtrees whose roots $v_{i_1}, \ldots, v_{i_m}$ are adjacent to nodes of outdegree 1 in $ST(\mathcal{R})$ but are not themselves in $ST(\mathcal{R})$. The next claim follows immediately and shows that this construction is indeed a cover, as required.

Claim 1

1. Every leaf $u \notin \mathcal{R}$ is in exactly one subset in the above partition.

2. A leaf $u \in \mathcal{R}$ does not belong to any subset in the above partition.
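The construction of the cover can be sketched as below. As an assumption made only for this example, nodes use heap-style numbering (root 1, children $2v$ and $2v+1$, receiver $u$ at leaf $N+u$); the helper names are hypothetical. The function collects the Steiner tree and returns the roots of the subtrees that hang off it.

```python
import math


def complete_subtree_cover(n_receivers, revoked):
    """Roots of the complete subtrees that hang off the Steiner tree ST(R)."""
    if not revoked:
        return [1]                     # nobody revoked: the whole tree
    steiner = set()
    for u in revoked:
        v = n_receivers + u
        while v >= 1 and v not in steiner:
            steiner.add(v)
            v //= 2
    cover = []
    for v in steiner:
        if v < n_receivers:            # internal node of the full tree
            for child in (2 * v, 2 * v + 1):
                if child not in steiner:
                    cover.append(child)
    return sorted(cover)


def leaves_under(n_receivers, v):
    """Receivers whose leaves lie in the subtree rooted at node v."""
    lo, hi = v, v
    while lo < n_receivers:
        lo, hi = 2 * lo, 2 * hi + 1
    return set(range(lo - n_receivers, hi - n_receivers + 1))


def cover_size_bound(n_receivers, r):
    """The r * log2(N / r) bound on the number of subsets, for r >= 1."""
    return r * math.log2(n_receivers / r)
```

For $N = 8$ and revoked receivers $\{0, 5\}$ the cover consists of the subtrees rooted at nodes 5, 7, 9 and 12, which matches the bound $r \log(N/r) = 4$.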

The cover size: The Steiner tree $ST(\mathcal{R})$ has $r$ leaves. An internal node is in $ST(\mathcal{R})$ iff it is on some path to a point in $\mathcal{R}$, therefore there are at most $r \log N$ nodes in $ST(\mathcal{R})$. A finer analysis takes into account double counting of the nodes closer to the root and the fact that a node of outdegree 2 in $ST(\mathcal{R})$ does not produce a subset, and shows that the number of subsets is at most $r \log(N/r)$. The analysis is as follows: note that the number of sets is exactly the number of degree 1 nodes in $ST(\mathcal{R})$. Assume by induction on the tree height that this is true for trees of depth $i$, i.e. that in a subtree with $r$ leaves the maximum number of nodes of degree 1 is at most $r \cdot (i - \log r)$. Then consider a tree of depth $i + 1$. If all the leaves are contained in one subtree of depth $i$, then by induction the total number of nodes of degree 1 is at most $r \cdot (i - \log r) + 1 \le r \cdot (i + 1 - \log r)$. Otherwise, the number of nodes of degree 1 is the number of nodes of degree 1 in the left subtree (that has $r_1 \ge 1$ leaves) plus the number of nodes of degree 1 in the right subtree (that has $r_2 \ge 1$ leaves), and $r = r_1 + r_2$. By induction, this is at most

$$r_1 \cdot (i - \log r_1) + r_2 \cdot (i - \log r_2) = r \cdot i - (r_1 \log r_1 + r_2 \log r_2) \le r \cdot (i + 1 - \log r)$$

since $r_1 \log r_1 + r_2 \log r_2 \ge r(\log r - 1)$; by convexity of $x \log x$ the left-hand side is smallest when $r_1 = r_2 = r/2$. Note that this is also the average number of subsets (where the $r$ leaves are chosen at random).

The Decryption Step: Given a message

$$\langle [i_1, \ldots, i_m, E_{L_{i_1}}(K), E_{L_{i_2}}(K), \ldots, E_{L_{i_m}}(K)], F_K(M) \rangle$$

a receiver $u$ needs to find whether any of its ancestors is among $i_1, i_2, \ldots, i_m$; note that there can be only one such ancestor, so $u$ may belong to at most one subset.

There are several ways to facilitate an efficient search in this list (see footnote 4). First consider a generic method that works whenever each receiver is a member of relatively few subsets $S_i$: the values $i_1, i_2, \ldots, i_m$ are put in a hash table and in addition a perfect hash function $h$ of the list is transmitted as well (see [15] for a recent survey of such functions). The length of the description of $h$ can be relatively small compared to the length of the list, i.e. it can be $o(m \log w)$. The receiver $u$ should check for all $i$ such that $u \in S_i$ whether $i$ is in the list by computing $h(i)$. In our case this would mean checking $\log N$ values.
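The generic lookup can be sketched for the Complete Subtree method as follows. Two assumptions are made for illustration only: nodes use heap-style numbering (root 1, children $2v$ and $2v+1$, receiver $u$ at leaf $N+u$), and a Python `set` stands in for the transmitted perfect hash function $h$.

```python
def find_my_subset(n_receivers, u, header_indices):
    """Return the index i_j in the header with u in S_{i_j}, or None.

    The receiver only probes the nodes on its own root-to-leaf path,
    i.e. the indices of the subsets it belongs to.
    """
    listed = set(header_indices)       # stand-in for the hash table / h
    v = n_receivers + u
    while v >= 1:
        if v in listed:
            return v
        v //= 2
    return None
```

A revoked receiver probes all of its ancestors without a hit and obtains `None`.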

Furthermore, suppose that we are interested in using as few bits as possible to represent the collection of subsets used, $\{i_1, i_2, \ldots, i_m\}$. The information-theoretic bound on the number of bits needed is $\lceil \log \binom{w}{m} \rceil$, which is roughly $m \log (w/m)$, using Stirling's approximation. (Note that when $m \approx \sqrt{w}$ this represents a factor 2 compression compared to storing $\{i_1, i_2, \ldots, i_m\}$ explicitly.) However, we are interested in a succinct representation of the collection that allows efficient lookup in this list. It turns out that with an additive factor of $O(m + \log \log w)$ bits it is possible to support an $O(1)$ lookup, see [10, 38]; the results they provide are even slightly better, but this bound is relatively simple to achieve.
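As a small numeric illustration of the counting bound, the sketch below compares $\lceil \log \binom{w}{m} \rceil$ with the cost of listing the $m$ indices explicitly. The function name and the sample values of $w$ and $m$ are chosen only for this example.

```python
import math


def index_set_bits(w, m):
    """Bits needed for an arbitrary m-subset of w indices, and the bits
    used when the m indices are simply listed one after another."""
    optimal = math.ceil(math.log2(math.comb(w, m)))
    explicit = m * math.ceil(math.log2(w))
    return optimal, explicit
```

For $w = 2^{20}$ and $m = 2^{10} = \sqrt{w}$ the explicit list takes 20480 bits, while the counting bound is a little under 12000 bits, so the saving is close to (though somewhat below) the factor of 2 quoted above.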

It turns out that we can do even better for the complete subtree method, given the special structure. For each node $u$, the desired ancestor $i_j$ in the list is the entry that shares the longest common prefix with $u$. Searching for this can be done by $\log \log N$ comparisons given the right preprocessing of the data, see [33].

Summarizing, in the complete subtree method (i) the message header consists of at most $r \log \frac{N}{r}$ keys and encryptions, (ii) receivers have to store $\log N$ keys and (iii) processing a message requires $O(\log \log N)$ operations plus a single decryption operation.

Security: The key assignment in this method is information theoretic, that is, keys are assigned randomly and independently. Hence the "key-indistinguishability" property of this method follows from the fact that no $u \in \mathcal{R}$ is contained in any of the subsets $i_1, \ldots, i_m$, as stated in Claim 1.

Theorem 2 The Complete Subtree Revocation method requires (i) message length of at most $r \log \frac{N}{r}$ keys, (ii) to store $\log N$ keys at a receiver and (iii) $O(\log \log N)$ operations plus a single decryption operation to decrypt a message. Moreover, the method is secure in the sense of Definition 11.


Comparison  to  the  Logical  Key  Hierarchy  (LKH)  approach  :  Readers  familiar  with  the  LKH  method 
of  [42, 43]  may  find  it  instructive  to  compare  it  to  the  Complete  Subtree  Scheme.  The  main  similarity  lies  in 
the  key  assignment  -  an  independent  label  is  assigned  to  each  node  in  the  binary  tree.  However,  these  labels 
are  used  quite  differently  -  in  the  multicast  re-keying  LKH  scheme  some  of  these  labels  change  at  every 
revocation.  In  the  Complete  Subtree  method  labels  are  static;  what  changes  is  a  single  session  key. 

Consider an extension of the LKH scheme which we call the Clumped re-keying method: here, $r$ revocations are performed at a time. For a batch of $r$ revocations, no label is changed more than once, i.e. only the "latest" value is transmitted and used. In this variant the number of encryptions is roughly the same as in the Complete Subtree method, but it requires $\log N$ decryptions at the user (as opposed to a single decryption in our framework). An additional advantage of the Complete Subtree method is the separation of the labels and the session key, which has a consequence on the message length; see the discussion in Section 4.1.

Footnote 4: This is relevant when the data is on a disk, rather than being broadcast, since broadcast results in scanning the list anyhow.

Figure 2: The Subset Difference Method: Subset $S_{i,j}$ contains all marked leaves (non-black).

3.2  The  Subset  Difference  Method 

The main disadvantage of the Complete Subtree method is that $\mathcal{N} \setminus \mathcal{R}$ may be partitioned into a number of subsets that is too large. The goal is now to reduce the partition size. We show an improved method that partitions the non-revoked receivers into at most $2r - 1$ subsets (or $1.25r$ on average), thus getting rid of a $\log N$ factor and effectively reducing the message length accordingly. In return, the number of keys stored by each receiver increases by a factor of $\frac{1}{2} \log N$. The key characteristic of the Subset-Difference method, which essentially leads to the reduction in message length, is that in this method any user belongs to substantially more subsets than in the first method ($O(N)$ instead of $\log N$). The challenge is then to devise an efficient procedure to succinctly encode this large set of keys at the user.

The  subset  description 

As in the previous method, the receivers are viewed as leaves in a complete binary tree. The collection of subsets $S_1, \ldots, S_w$ defined by this algorithm corresponds to subsets of the form "a group of receivers $G_1$ minus another group $G_2$", where $G_2 \subset G_1$. The two groups $G_1, G_2$ correspond to leaves in two full binary subtrees. Therefore a valid subset $S$ is represented by two nodes in the tree $(v_i, v_j)$ such that $v_i$ is an ancestor of $v_j$. We denote such a subset as $S_{i,j}$. A leaf $u$ is in $S_{i,j}$ iff it is in the subtree rooted at $v_i$ but not in the subtree rooted at $v_j$, or in other words $u \in S_{i,j}$ iff $v_i$ is an ancestor of $u$ but $v_j$ is not. Figure 2 depicts $S_{i,j}$. Note that all subsets from the Complete Subtree Method are also subsets of the Subset Difference Method; specifically, a subtree appears here as the difference between its parent and its sibling. The only exception is the full tree itself, and we will add a special subset for that. We postpone the description of the key assignment till later; for the time being assume that each subset has an associated key $L_{i,j}$.
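The membership rule for $S_{i,j}$ can be written down directly. The sketch assumes heap-style node numbering (root 1, children $2v$ and $2v+1$, receiver $u$ at leaf $N+u$), which is a convention chosen for this example and not taken from the paper.

```python
def is_ancestor(a, v):
    """True if node a equals node v or lies on the path from v to the root."""
    while v > a:
        v //= 2
    return v == a


def in_subset_difference(n_receivers, i, j, u):
    """u is in S_{i,j} iff v_i is an ancestor of u but v_j is not."""
    leaf = n_receivers + u
    return is_ancestor(i, leaf) and not is_ancestor(j, leaf)
```

With $N = 8$, the pair $(2, 5)$ yields receivers $\{0, 1\}$, which is exactly the complete subtree rooted at node 4 expressed as "parent minus sibling".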

The  Cover 

For a given set $\mathcal{R}$ of revoked receivers, let $u_1, \ldots, u_r$ be the leaves corresponding to the elements in $\mathcal{R}$. The Cover is a collection of disjoint subsets $S_{i_1,j_1}, S_{i_2,j_2}, \ldots, S_{i_m,j_m}$ which partitions $\mathcal{N} \setminus \mathcal{R}$. Below is an algorithm for finding the cover, and an analysis of its size (number of subsets).

Finding the Cover: The method partitions $\mathcal{N} \setminus \mathcal{R}$ into disjoint subsets $S_{i_1,j_1}, S_{i_2,j_2}, \ldots, S_{i_m,j_m}$ as follows: let $ST(\mathcal{R})$ be the (directed) Steiner Tree induced by $\mathcal{R}$ and the root. We build the subsets collection iteratively, maintaining a tree $T$ which is a subtree of $ST(\mathcal{R})$ with the property that any $u \in \mathcal{N} \setminus \mathcal{R}$ that is below a leaf of $T$ has been covered. We start by making $T$ be equal to $ST(\mathcal{R})$ and then iteratively remove nodes from $T$ (while adding subsets to the collection) until $T$ consists of just a single node:

1. Find two leaves $v_i$ and $v_j$ in $T$ such that the least-common-ancestor $v$ of $v_i$ and $v_j$ does not contain any other leaf of $T$ in its subtree. Let $v_l$ and $v_k$ be the two children of $v$ such that $v_i$ is a descendant of $v_l$ and $v_j$ a descendant of $v_k$. (If there is only one leaf left, make $v_i = v_j$ the leaf, $v$ the root of $T$ and $v_l = v_k = v$.)

2. If $v_l \ne v_i$ then add the subset $S_{l,i}$ to the collection; likewise, if $v_k \ne v_j$ add the subset $S_{k,j}$ to the collection.

3. Remove from $T$ all the descendants of $v$ and make it a leaf.
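The result of these steps can be reproduced with a short sketch. It is not a line-by-line transcription of the iteration: it walks the chains of outdegree-1 nodes in the Steiner tree, starting at the root and at each child of an outdegree-2 node, which yields the same subsets. Heap-style numbering (root 1, children $2v$ and $2v+1$, receiver $u$ at leaf $N+u$) and all function names are assumptions made for this example.

```python
def steiner_nodes(n_receivers, revoked):
    """Nodes on the paths from the root to every revoked leaf."""
    nodes = set()
    for u in revoked:
        v = n_receivers + u
        while v >= 1 and v not in nodes:
            nodes.add(v)
            v //= 2
    return nodes


def subset_difference_cover(n_receivers, revoked):
    """Pairs (i, j), each standing for S_{i,j}, covering the non-revoked."""
    if not revoked:
        raise ValueError("no revocations: use the special full-tree subset")
    steiner = steiner_nodes(n_receivers, revoked)

    def steiner_children(v):
        if v >= n_receivers:           # leaves have no children
            return []
        return [c for c in (2 * v, 2 * v + 1) if c in steiner]

    heads = [1]                        # a chain may start at the root ...
    for v in steiner:
        kids = steiner_children(v)
        if len(kids) == 2:             # ... or below an outdegree-2 node
            heads.extend(kids)

    cover = []
    for head in heads:
        tail = head
        while len(steiner_children(tail)) == 1:   # follow outdegree 1
            tail = steiner_children(tail)[0]
        if tail != head:               # skip chains with a single node
            cover.append((head, tail))
    return sorted(cover)


def members(n_receivers, i, j):
    """Receivers in S_{i,j}: below v_i but not below v_j."""
    def below(v):
        lo, hi = v, v
        while lo < n_receivers:
            lo, hi = 2 * lo, 2 * hi + 1
        return set(range(lo - n_receivers, hi - n_receivers + 1))
    return below(i) - below(j)
```

For $N = 8$ and revoked receivers $\{0, 5\}$ the cover is $S_{2,8}$ and $S_{3,13}$, two subsets instead of the four needed by the Complete Subtree method for the same revoked set.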

An alternative description of the cover algorithm is as follows: consider maximal chains of nodes with outdegree 1 in $ST(\mathcal{R})$. More precisely, each such chain is of the form $[v_{i_1}, v_{i_2}, \ldots, v_{i_\ell}]$ where (i) all of $v_{i_1}, v_{i_2}, \ldots, v_{i_{\ell-1}}$ have outdegree 1 in $ST(\mathcal{R})$, (ii) $v_{i_\ell}$ is either a leaf or a node with outdegree 2 and (iii) the parent of $v_{i_1}$ is either a node of outdegree 2 or the root. For each such chain where $\ell \ge 2$ add a subset $S_{i_1, i_\ell}$ to the cover. Note that all nodes of outdegree 1 in $ST(\mathcal{R})$ are members of precisely one such chain.

The cover size: Lemma 3 shows that a cover can contain at most $2r - 1$ subsets for any set of $r$ revocations. Furthermore, if the set of revoked leaves is random, then the average number of subsets in a cover is $1.25r$.

Lemma 3 Given any set of revoked leaves $\mathcal{R}$, the above method partitions $\mathcal{N} \setminus \mathcal{R}$ into at most $2r - 1$ disjoint subsets.

Proof: Every iteration increases the number of subsets by at most two (in Step 2) and reduces the number of the Steiner leaves by one (in Step 3), except the last iteration, which may not reduce the number of leaves but adds only one subset. Starting with $r$ leaves, the process generates a total of at most $2r - 1$ subsets. Moreover, every non-revoked $u$ is in exactly one subset, the one defined by the first chain of nodes of outdegree 1 in $ST(\mathcal{R})$ that is encountered while moving from $u$ towards the root. This encounter must hit a non-empty chain, since the path from $u$ to the root cannot join $ST(\mathcal{R})$ in an outdegree 2 node, since this implies that $u \in \mathcal{R}$. □

The  next  lemma  is  concerned  with  covering  more  general  sets  than  those  obtained  by  removing  users. 
Rather  it  assumes  that  we  are  removing  a  collection  of  subsets  from  the  Subset  Difference  collection.  It  is 
applied  later  in  Sections  4.2  and  5.2. 

Lemma 4 Let $\mathcal{S} = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$ be a collection of $m$ disjoint subsets from the underlying collection defined by the Subset Difference method, and $\mathcal{U} = \bigcup_{j=1}^{m} S_{i_j}$. Then the leaves in $\mathcal{N} \setminus \mathcal{U}$ can be covered by at most $3m - 1$ subsets from the underlying Subset Difference collection.

Proof: The proof is by induction on $m$. When $m = 1$, $\mathcal{S}$ contains a single set. Let this set be $S_{a,b}$, which is the set that is represented by two nodes in the tree $(v_a, v_b)$. Denote by $v_c$ and $v_{c'}$ the parent and the sibling of $v_b$ respectively (it is possible that $v_a = v_c$), and by $r$ the root of the tree. Then the leaves in $\mathcal{N} \setminus \mathcal{U}$ are covered by the following two sets: $S_{r,a}$ and $S_{c,c'}$. If $v_a = v_c$ then the cover consists of a single set, $S_{r,c'}$.

To handle the case where $m > 1$, we need the following definition. We say that a set $S_{x,y}$ is nested within the set $S_{a,b}$ if the tree node $v_x$ is contained in the subtree rooted at $v_b$. Note that if two subsets $S_{a,b}$ and $S_{a',b'}$ are disjoint but not nested then the subtrees rooted at $v_a$ and $v_{a'}$ must be disjoint (see footnote 5). Consider the following two cases:

1. All sets in $\mathcal{S}$ are maximal with respect to the nesting property. Let $S_{i_j} = S_{a_j,b_j}$ be the $j$th set in $\mathcal{S}$. A cover for $\mathcal{N} \setminus \mathcal{U}$ is constructed by first covering all the subtrees rooted at the $v_{b_j}$'s, and then covering the rest of the leaves that are not contained in any one of the subtrees rooted at $v_{a_j}$. That is, for each set $S_{a_j,b_j}$ in $\mathcal{S}$, construct the set $S_{c_j,d_j}$ where $v_{c_j}$ and $v_{d_j}$ are the parent and the sibling of $v_{b_j}$ respectively, for the total of $m$ sets. To cover the rest, treat the nodes $v_{a_1}, \ldots, v_{a_m}$ as $m$ revoked leaves and apply Lemma 3 to cover this tree. This requires $2m - 1$ additional sets, hence the number of sets required to cover $\mathcal{N} \setminus \mathcal{U}$ in this case is $3m - 1$.

2. $\mathcal{S} = \mathcal{S}_1 \cup \mathcal{S}_2$ such that $|\mathcal{S}_1| = k \ge 1$ and there exists a maximal set $S_{a,b} \in \mathcal{S}_2$ with respect to the nesting property such that all sets in $\mathcal{S}_1$ are nested within $S_{a,b}$. Let $\mathcal{U}'$ be the subtree rooted at $v_b$. The idea is to first cover $\mathcal{N} \setminus (\mathcal{U} \cup \mathcal{U}')$ and then cover the leaves in $\mathcal{U}' \setminus \mathcal{U}$. The first part of the cover can be obtained by applying the lemma recursively with the input $\mathcal{S}_2$, and the second part by applying it recursively with $\mathcal{S}_1$. By the induction hypothesis, this requires the total number of $3(m - k) - 1 + 3k - 1 = 3m - 2$ sets.


□ 

Average-case analysis: The analysis of Lemma 3 is a worst-case analysis and there are instances which actually require $2r - 1$ sets. However, it is a bit pessimistic in the sense that it ignores the fact that a chain of nodes of outdegree 1 in $ST(\mathcal{R})$ may consist only of the end point, in which case no subset is generated. This corresponds to the case where $v_l = v_i$ or $v_k = v_j$ in Step 2. Suppose that the revoked set $\mathcal{R}$ is selected at random from all subsets of cardinality $r$ of $\mathcal{N}$; what is then the expected number of subsets generated? The question is how many outdegree 1 chains are empty (i.e. contain only one point). We can bound it from above as follows: consider any chain for which it is known that there are $k$ members beneath it. Then the probability that the chain is not empty is at most $2^{-(k-1)}$. For any $1 \le k \le r$ there can be at most $r/k$ chains such that there are $k$ leaves beneath it, since no such chain can be ancestor of another chain with $k$ descendants. Therefore the expected number of non-empty chains is bounded by

$$\sum_{k=1}^{r} \frac{r}{k} \cdot \frac{1}{2^{k-1}} \le 2r \sum_{k=1}^{\infty} \frac{1}{k \cdot 2^{k}} = 2 \ln 2 \cdot r \approx 1.38\, r.$$

Simulation  experiments  have  shown  a  tighter  bound  of  1.25r  for  the  random  case.  So  the  actual  number  of 
subsets  used  by  the  Subset  Difference  scheme  is  expected  to  be  slightly  lower  than  the  2r  —  1  worst  case 
result. 
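The analytic bound on the expected number of non-empty chains is easy to evaluate numerically. The sketch below simply sums the series; the function name is illustrative, and it checks only the analytic bound, not the simulation figure of $1.25r$.

```python
import math


def expected_chain_bound(r):
    """Sum over k = 1..r of (r / k) * 2^-(k-1), the bound derived above."""
    return sum((r / k) * 2.0 ** (-(k - 1)) for k in range(1, r + 1))


def limiting_constant():
    """2 * ln 2, the value the per-revocation bound converges to."""
    return 2 * math.log(2)
```

For every $r$ the ratio `expected_chain_bound(r) / r` stays below $2 \ln 2$, and it is numerically indistinguishable from that constant once $r$ exceeds a few dozen.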

Key  assignment  to  the  subsets 

We now define what information each receiver must store. If we try to repeat the information-theoretic approach of the previous scheme, where each receiver needs to store explicitly the keys of all the subsets it belongs to, the storage requirements would expand tremendously: consider a receiver $u$; for each complete subtree $T_k$ it belongs to, $u$ must store a number of keys proportional to the number of nodes in the subtree $T_k$ that are not on the path from the root of $T_k$ to $u$. There are $\log N$ such trees, one for each height $1 \le k \le \log N$, yielding a total of $\sum_{k=1}^{\log N} (2^k - k)$, which is $O(N)$ keys.

Footnote 5: The only exception is the case where $b$ and $b'$ are siblings and are both children of $a$. This is a degenerate case, and the two subsets should be replaced by a single new subset covering all the leaves below $v_a$.

We therefore devise a key assignment method that requires a receiver to store only $O(\log N)$ keys per subtree, for the total of $O(\log^2 N)$ keys.

While the total number of subsets to which a user $u$ belongs is $O(N)$, these can be grouped into $\log N$ clusters defined by the first subset $i$ (from which another subset is subtracted). The way we proceed with the key assignment is to choose, for each $1 \le i \le N$ corresponding to an internal node in the full binary tree, a random and independent value $\mathrm{LABEL}_i$. This value should induce the keys for all legitimate subsets of the form $S_{i,j}$. The idea is to employ the method used by Goldreich, Goldwasser and Micali [28] to construct pseudo-random functions, which was also used by Fiat and Naor [23] for purposes similar to ours.

Let $G$ be a (cryptographic) pseudo-random sequence generator (see definition below) that triples the input, i.e. whose output length is three times the length of the input; let $G_L(S)$ denote the left third of the output of $G$ on seed $S$, $G_R(S)$ the right third and $G_M(S)$ the middle third. We say that $G : \{0,1\}^n \mapsto \{0,1\}^{3n}$ is a pseudo-random sequence generator if no polynomial-time adversary can distinguish the output of $G$ on a randomly chosen seed from a truly random string of similar length. Let $\epsilon_4$ denote the bound on the distinguishing probability.

Consider now the subtree $T_i$ (rooted at $v_i$). We will use the following top-down labeling process: the root is assigned a label $\mathrm{LABEL}_i$. Given that a parent was labeled $S$, its two children are labeled $G_L(S)$ and $G_R(S)$ respectively. Let $\mathrm{LABEL}_{i,j}$ be the label of node $v_j$ derived in the subtree $T_i$ from $\mathrm{LABEL}_i$. Following such a labeling, the key $L_{i,j}$ assigned to set $S_{i,j}$ is $G_M$ of $\mathrm{LABEL}_{i,j}$. Note that each label induces three parts: $G_L$ - the label for the left child, $G_R$ - the label for the right child, and $G_M$ - the key at the node. The process of generating labels and keys for a particular subtree is depicted in Figure 3. For such a labeling process, given the label of a node it is possible to compute the labels (and keys) of all its descendants. On the other hand, without receiving the label of an ancestor of a node, its label is pseudo-random, and for a node $j$, given the labels of all its descendants (but not including itself) the key $L_{i,j}$ is pseudo-random ($\mathrm{LABEL}_{i,j}$, the label of $v_j$, is not pseudo-random given this information simply because one can check for consistency of the labels). It is important to note that given $\mathrm{LABEL}_i$, computing $L_{i,j}$ requires at most $\log N$ invocations of $G$.

We now describe the information $I_u$ that each receiver $u$ gets in order to derive the key assignment described above. For each subtree $T_i$ such that $u$ is a leaf of $T_i$ the receiver $u$ should be able to compute $L_{i,j}$ iff $j$ is not an ancestor of $u$. Consider the path from $v_i$ to $u$ and let $v_{i_1}, v_{i_2}, \ldots, v_{i_k}$ be the nodes just "hanging off" the path, i.e. they are adjacent to the path but not ancestors of $u$ (see Figure 3). Each $j$ in $T_i$ that is not an ancestor of $u$ is a descendant of one of these nodes. Therefore if $u$ receives the labels of $v_{i_1}, v_{i_2}, \ldots, v_{i_k}$ as part of $I_u$, then invoking $G$ at most $\log N$ times suffices to compute $L_{i,j}$ for any $j$ that is not an ancestor of $u$.
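The labeling process and the receiver's derivation of $L_{i,j}$ can be sketched as follows. Two assumptions are made purely for illustration: the three thirds of $G$ are imitated by SHA-256 with a one-byte tag (this is not a proven pseudo-random generator), and nodes use heap-style numbering (root 1, children $2v$ and $2v+1$), so the low-order bits of $j$ spell the left/right path from an ancestor to $v_j$. All function names are hypothetical.

```python
import hashlib


def G(seed, part):
    """Stand-in for one third of G; part is b"L", b"M" or b"R"."""
    return hashlib.sha256(part + seed).digest()[:16]


def derive_label(label, start, target):
    """Label of node `target`, given the label of its ancestor `start`."""
    depth_diff = target.bit_length() - start.bit_length()
    assert depth_diff >= 0 and (target >> depth_diff) == start
    for shift in range(depth_diff - 1, -1, -1):
        bit = (target >> shift) & 1
        label = G(label, b"R" if bit else b"L")   # G_R or G_L per step
    return label


def subset_key(label_i, i, j):
    """Center side: L_{i,j} = G_M(LABEL_{i,j})."""
    return G(derive_label(label_i, i, j), b"M")


def receiver_labels(label_i, i, leaf):
    """Labels of the nodes hanging off the path from v_i down to the leaf."""
    stored = {}
    v = leaf
    while v != i:
        sibling = v ^ 1
        stored[sibling] = derive_label(label_i, i, sibling)
        v //= 2
    return stored


def receiver_subset_key(stored, j):
    """Receiver side: derive L_{i,j} from the stored labels of T_i."""
    v = j
    while v >= 1:
        if v in stored:
            return G(derive_label(stored[v], v, j), b"M")
        v //= 2
    raise KeyError("v_j is an ancestor of the receiver or outside T_i")
```

A receiver at leaf 13 of a tree rooted at node 1 stores the labels of nodes 12, 7 and 2; it can derive $L_{1,4}$ from the label of node 2 but cannot derive $L_{1,6}$, because node 6 is one of its own ancestors.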

As for the total number of keys (in fact, labels) stored by receiver $u$, each tree $T_i$ of depth $k$ that contains $u$ contributes $k - 1$ keys (plus one key for the case where there are no revocations), so the total is

$$1 + \sum_{k=1}^{\log N + 1} (k - 1) = 1 + \frac{(\log N + 1) \log N}{2} = \frac{1}{2} \log^2 N + \frac{1}{2} \log N + 1.$$
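The closed form can be checked by evaluating the sum directly; the function names below are chosen for this example only.

```python
def labels_stored(log_n):
    """1 + sum over k = 1..log N + 1 of (k - 1)."""
    return 1 + sum(k - 1 for k in range(1, log_n + 2))


def labels_stored_closed_form(log_n):
    """(1/2) log^2 N + (1/2) log N + 1, computed with integers."""
    return (log_n * log_n + log_n) // 2 + 1
```

For $N = 2^{20}$ receivers this gives 211 labels per receiver, against the 21 keys of the Complete Subtree method.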


Decryption Step: At decryption time, a receiver $u$ first finds the subset $S_{i,j}$ such that $u \in S_{i,j}$, and computes the key corresponding to $L_{i,j}$. Using the techniques described in the complete subtree method for table lookup structure, this subset can be found in $O(\log \log N)$. The evaluation of the subset key takes now at most $\log N$ applications of a pseudo-random generator. After that, a single decryption is needed.


Figure 3: Key Assignment in the Subset Difference Method. Left: generation of $\mathrm{LABEL}_{i,j}$ and the key $L_{i,j}$; in the depicted example $S = \mathrm{LABEL}_i$, $\mathrm{LABEL}_{i,j} = G_R(G_L(G_L(\mathrm{LABEL}_i)))$ and $L_{i,j} = G_M(\mathrm{LABEL}_{i,j})$. Right: leaf $u$ receives the labels of $v_{i_1}, \ldots, v_{i_k}$ that are induced by the label $\mathrm{LABEL}_i$ of $v_i$.

Security 

In  order  to  prove  security  we  show  that  the  key-indistinguishability  condition  (Definition  9  of  Section  6) 
holds  for  this  method,  namely  that  each  key  is  indistinguishable  from  a  random  key  for  all  users  not  in 
the  corresponding  subset.  Theorem  12  of  Section  6  proves  that  this  condition  implies  the  security  of  the 
algorithm. 

Observe first that for any $u \in \mathcal{N}$, $u$ never receives keys that correspond to subtrees to which it does not belong. Let $S_i$ denote the set of leaves in the subtree $T_i$ rooted at $v_i$. For any set $S_{i,j}$ the key $L_{i,j}$ is (information theoretically) independent of all $I_u$ for $u \notin S_i$. Therefore we have to consider only the combined secret information of all $u \in S_j$. This is specified by at most $\log N$ labels - those hanging on the path from $v_i$ to $v_j$ plus the two children of $v_j$ - which are sufficient to derive all other labels in the combined secret information. Note that these labels are $\log N$ strings that were generated independently by $G$, namely it is never the case that one string is derived from another. Hence, a hybrid argument implies that the probability of distinguishing $L_{i,j}$ from random can be at most $\epsilon_4 / \log N$, where $\epsilon_4$ is the bound on distinguishing outputs of $G$ from random strings.

Theorem 5 The Subset Difference method requires (i) message length of at most $2r - 1$ keys, (ii) to store $\frac{1}{2} \log^2 N + \frac{1}{2} \log N + 1$ keys at a receiver and (iii) $O(\log N)$ operations plus a single decryption operation to decrypt a message. Moreover, the method is secure in the sense of Definition 11.

3.3 Lower Bounds

Generic  lower  bound 

Any ciphertext in a revocation system when $r$ users are revoked should clearly encode the original message plus the revoked subset, since it is possible to test which users decrypt correctly and which incorrectly using the preassigned secret information only (that was chosen independently of the transmitted message). Therefore we have a "generic" lower bound of $\log \binom{N}{r} \approx r \log N$ bits on the length of the header (or extra bits). Note that the subset difference method approaches this bound - the number of extra bits there is $O(r \cdot \text{key-size})$.
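A quick numeric check of the generic bound, with sample values of $N$ and $r$ chosen only for this example:

```python
import math


def header_lower_bound_bits(n, r):
    """log2 of the number of possible revoked sets of size r among n users."""
    return math.log2(math.comb(n, r))


def rough_estimate_bits(n, r):
    """The approximation r * log2(n), reasonable when r is much smaller than n."""
    return r * math.log2(n)
```

For $N = 2^{20}$ and $r = 16$ the exact value is about 276 bits against the estimate of 320 bits; the gap is roughly $\log r!$, which is small relative to $r \log N$ when $r \ll N$.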

Lower  bounds  for  the  information-theoretic  case 

If  the  keys  to  all  the  subsets  are  chosen  independently  (and  hence  u  explicitly  receives  in  Iu  all  L*  such  that 
u  e  Si)  then  Luby  and  Staddon’s  lower  bound  for  the  “Or  Protocol”  [31]  can  be  applied.  They  used  the 
Sunflower  Lemma  (see  below)  to  show  that  any  scheme  which  employs  m  subsets  to  revoke  r  users  must 

have at least one member with at least … keys. This means that if we want the number of subsets $m$
to be at most $r$, then the receivers should store at least $\Omega(N/r^3)$ keys (as $\binom{N}{r} \ge (N/r)^r$). In the case where
$r \ll N$, our (non-information-theoretic) Subset Difference method does better than this lower bound.

Note that when the number of subsets used in a broadcast is $O(r \log N)$ (as it is in the Complete Subtree
method) then the above bound becomes useless. We now show that even if one is willing to use this
many subsets (or even more), then at least $\Omega(\log N)$ keys should be stored by the receivers. We recall the
Sunflower Lemma of Erdős and Rado (see [21]).

Definition 6 Let $S_1, S_2, \ldots, S_\ell$ be subsets of some underlying finite ground set. We say that they are a
sunflower if the intersections of any pair of the subsets are equal; in other words, for all $1 \le i < j \le \ell$ we
have $S_i \cap S_j = \bigcap_{t=1}^{\ell} S_t$.
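
The definition translates directly into a check on a family of sets. The following is a minimal sketch; the function name and the use of `frozenset` are choices made for this example.

```python
from itertools import combinations


def is_sunflower(sets: list[frozenset]) -> bool:
    """True if every pairwise intersection equals the intersection of all sets."""
    if len(sets) < 2:
        return True
    core = frozenset.intersection(*sets)  # the common "core" of the family
    return all(a & b == core for a, b in combinations(sets, 2))


if __name__ == "__main__":
    petals = [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({1, 2, 5})]
    print(is_sunflower(petals))  # core {1, 2}, disjoint petals
    triangle = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})]
    print(is_sunflower(triangle))  # pairwise intersections differ
```

Note that a family of pairwise disjoint sets is a sunflower with an empty core.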

The Sunflower Lemma says that in every set system there exists a sufficiently large sunflower: in a collection
of $N$ subsets each of size at most $k$ there exists a sunflower consisting of at least $\sqrt[k]{N/k!}$ subsets.

Consider now the sets $T_1, T_2, \ldots, T_N$ of keys the receivers store, i.e. $T_u = \{L_i \mid u \in S_i\}$. If for all $u$ we
have that $|T_u| \le k$, then there exists a sunflower of $\sqrt[k]{N/k!}$ subsets. Pick one $u$ such that $T_u$ is in the sunflower
and make $\mathcal{R} = \{u\}$. This means that in order to cover the other members of the sunflower we must use at
least $\sqrt[k]{N/k!} - 1$ keys, since no $S_i$ can be used to cover two of the other members of the sunflower (otherwise
$S_i$ must also have the revoked $u$ as a member). This means, for instance, that if $k = \sqrt{\log N}$ then just to
revoke a single user requires using at least $\sqrt[k]{N/k!} - 1$ subsets for this value of $k$.


4  Further  Discussions 

4.1  Implementation  Issues 
Implementing $E_L$ and $F_K$

One of the issues that arises in implementing a Subset-Cover scheme is how to implement the two cryptographic
primitives $E_L$ and $F_K$. The basic requirements from $E_L$ and $F_K$ were outlined above in Section
2. However, it is sometimes desirable to choose an encryption $F$ that might be weaker (uses shorter keys)
than the encryption chosen for $E$. The motivation for that is twofold: (1) to speed up the decoding process at
the receiver and (2) to shorten the length of the header. Such a strategy makes sense, for example, for copyright
protection purposes. There it may not make sense to protect a specific ciphertext so that breaking it is very
expensive; on the other hand we do want to protect the long-lived keys of the system with a strong encryption
scheme.

Suppose that $F$ is implemented by using a stream cipher with a long key, but sending some of its bits
in the clear; thus $K$ corresponds to the hidden part of the key and this is the only part that needs to be
encrypted in the header. (One reason to use $F$ in such a mode rather than simply using a method designed
with a small key is to prevent a preprocessing attack against $F$.) This in itself does not shorten the header,
since it depends on the block length of $E$ (assuming $E$ is implemented by a block cipher). We now provide
a specification for using $E$, called Prefix-Truncation, which reduces the header length as well, in addition
to achieving speedup, without sacrificing the security of the long-lived keys. Let $\mathrm{Prefix}_i S$ denote the first
$i$ bits of a string $S$. Let $E_L$ be a block cipher and $U$ be a random string whose length is the length of
the block of $E_L$. Let $K$ be a relatively short key for the cipher $F_K$ (whose length is, say, 56 bits). Then,
$[\mathrm{Prefix}_{|K|} E_L(U)] \oplus K$ provides an encryption that satisfies the definition of $E$ as described in Section 6.
The Prefix-Truncated header is therefore:


$\langle [i_1, i_2, \ldots, i_m, U, [\mathrm{Prefix}_{|K|} E_{L_{i_1}}(U)] \oplus K, \ldots, [\mathrm{Prefix}_{|K|} E_{L_{i_m}}(U)] \oplus K], F_K(M) \rangle$

Note that this reduces the length of the header down to about $m \times |K|$ bits long (say $56m$) instead of
$m \times |L|$. In the case where the key length of $E$ is marginal, then the following heuristic can be used to
remove the factor $m$ advantage that the adversary has in a brute-force attack which results from encrypting
the same string $U$ with $m$ different keys. Instead, encrypt the string $U \oplus i_j$, namely


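$\langle [i_1, i_2, \ldots, i_m, U, [\mathrm{Prefix}_{|K|} E_{L_{i_1}}(U \oplus i_1)] \oplus K, \ldots, [\mathrm{Prefix}_{|K|} E_{L_{i_m}}(U \oplus i_m)] \oplus K], F_K(M) \rangle$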
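
The following sketch illustrates the Prefix-Truncated header with the $U \oplus i_j$ heuristic. It is not a definitive implementation: the Python standard library has no block cipher, so HMAC-SHA256 truncated to one block is used as a stand-in for $E_L$ (an assumption of this example), and the block length, the names `make_header` and `recover_key`, and the dictionary layout are all hypothetical. Note that the receiver only needs to evaluate $E_L$ in the forward direction and XOR.

```python
import hashlib
import hmac
import secrets

BLOCK_BYTES = 16  # assumed block length of E
KEY_BYTES = 7     # |K| = 56 bits, as in the example in the text


def E(L: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher E_L (assumption: HMAC-SHA256 used as a PRF)."""
    return hmac.new(L, block, hashlib.sha256).digest()[:BLOCK_BYTES]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def mask(L: bytes, U: bytes, index: int) -> bytes:
    """Prefix_{|K|} E_L(U xor i_j): the pad that hides the session key."""
    tweak = index.to_bytes(BLOCK_BYTES, "big")
    return E(L, xor(U, tweak))[:KEY_BYTES]


def make_header(subset_keys: dict[int, bytes], K: bytes):
    """Build (U, {i_j: masked K}) for the subsets in the cover."""
    U = secrets.token_bytes(BLOCK_BYTES)
    entries = {i: xor(mask(L, U, i), K) for i, L in subset_keys.items()}
    return U, entries


def recover_key(index: int, L: bytes, U: bytes, entries: dict[int, bytes]) -> bytes:
    """Receiver side: recompute the pad with its own L_{i_j} and strip it."""
    return xor(mask(L, U, index), entries[index])
```

Each header entry is $|K|$ bytes long rather than a full block, which is the source of the $m \times |K|$ header length noted above.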

All-Or-Nothing Encryptions for $F_K$

As before, we can imagine cases where the key used by $F_K$ is only marginally long enough. Moreover, in a
typical scenario like copyright protection, the message $M$ is long (e.g. $M$ may be a title on a CD or a DVD
track). In such cases, it is possible to extract more security from the long message for a fixed number of key
bits using the All-Or-Nothing encryption mode originally suggested by [40]. These techniques assure that
the entire ciphertext must be decrypted before even a single message block can be determined. The concrete
method of [40] results in a penalty of a factor of three in the number of encryptions/decryptions required by
a legitimate user; however, for a long message that is composed of $\ell$ blocks, a brute-force attack requires a
factor of $\ell$ more time than a similar attack would require otherwise. Other All-Or-Nothing methods can be
applied as well.

The  drawback  of  using  an  All-Or-Nothing  mode  is  its  latency ,  namely  the  entire  message  M  must  be 
decoded  before  the  first  block  of  plaintext  is  known.  This  makes  the  technique  unusable  for  applications  that 
cannot  tolerate  such  latency. 

Frequently  Refreshed  Session  Keys 

Suppose that we want to prevent an illegal redistribution channel that will use some low bandwidth means
to send $K$, the session key (a low bandwidth label or a bootlegged CD). A natural approach to combat such
a channel is to encode different parts of the message $M$ with different session keys, and to send all the different
session keys encrypted with all the subset keys. That is, send $\ell > 1$ different session keys all encrypted with
the same cover, thus increasing the length of the header by a factor of $\ell$. This means that in order to have
only a modest increase in the header information it is important that $m$, the number of subsets, will be as
small as possible. Note that the number of decryptions that the receiver needs to perform in order to obtain
its key $L_{i_j}$ which is used in this cover remains one.

Storage  at  the  Center 

In  both  the  Complete  Subtree  and  Subset  Difference  methods,  a  unique  label  is  associated  with  each  node 
in  the  tree.  Storing  these  labels  explicitly  at  the  Center  can  become  a  serious  constraint.  However,  these 
labels  can  be  generated  at  the  center  by  applying  a  pseudo-random  function  on  the  name  of  the  node  without 
affecting the security of the scheme. This reduces the storage required at the Center to the single key of
the  pseudo-random  function. 
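
As a minimal sketch of this idea, the Center can derive each label on demand from one master key. HMAC-SHA256 is used here as the pseudo-random function, and the node naming convention (a string such as `"node:37"`) and the 16-byte label length are assumptions of the example, not something the scheme prescribes.

```python
import hashlib
import hmac


def node_label(master_key: bytes, node_name: str, label_bytes: int = 16) -> bytes:
    """Derive the label of a tree node from the Center's single master key."""
    digest = hmac.new(master_key, node_name.encode("utf-8"), hashlib.sha256).digest()
    return digest[:label_bytes]


if __name__ == "__main__":
    master = b"center master key (example only)"
    print(node_label(master, "node:1").hex())
    print(node_label(master, "node:37").hex())
```

The Center stores only `master_key`; any label can be recomputed when a device is manufactured or a header is built.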

Furthermore,  it  may  be  desirable  to  distribute  the  center  between  several  servers  with  the  objective  of 
avoiding  a  single  or  few  points  of  attack.  In  such  a  case  the  distributed  pseudo-random  functions  of  [36] 
may  be  used  to  define  the  labels. 

4.2  Hierarchical  Revocation 

Suppose  that  the  receivers  are  grouped  in  a  hierarchical  manner,  and  that  it  is  desirable  to  revoke  a  group 
that  consists  of  the  subordinates  of  some  entity,  without  paying  a  price  proportional  to  the  group  size  (for 
instance  all  the  players  of  a  certain  manufacturer).  Both  methods  of  Section  3  lend  themselves  to  hierarchical 
revocation  naturally,  given  the  tree  structure.  If  the  hierarchy  corresponds  to  the  tree  employed  by  the 
methods,  then  to  revoke  the  receivers  below  a  certain  node  counts  as  just  a  single  user  revocation. 

By  applying  Lemma  4  we  get  that  in  the  Subset  Difference  Method  we  can  remove  any  collection  of  m 
subsets  and  cover  the  rest  with  3m  —  1  subsets.  Hence,  the  hierarchical  revocation  can  be  performed  by  first 
constructing  m  sets  that  cover  all  revoked  devices,  and  then  covering  all  the  rest  with  3m  -  1,  yielding  the 
total  of  4m  sets. 

4.3  Public  Key  methods 

In some scenarios it is desirable to use a revocation scheme in a public-key mode, i.e. when the party that
generates the ciphertext is not necessarily trustworthy and should not have access to the decryption keys
of the users, or when ciphertexts may be generated by a number of parties. Any Subset-Cover revocation
algorithm can be used in this mode: the Center (a trusted entity) generates the private keys corresponding to
the subsets and hands each user the private keys it needs for decryption. The (not necessarily trusted) party
that generates the ciphertext should only have access to the public keys corresponding to the subsets, which we
call “the public-key file”. That is, $E$ is a public-key cryptosystem whereas $F$ is as before. In principle, any
public-key encryption scheme with sufficient security can be used for $E$. However, not all yield a system
with reasonable efficiency. Below we discuss the problems involved, and show that a Diffie-Hellman type
scheme best serves this mode.

Public Key Generation: Recall that the Subset Difference method requires that subset keys are derived
from labels. If used in a public-key mode, the derivation yields random bits that are then used to generate
the private/public key pair. For example, if RSA keys are used, then the random strings that are generated
by the pseudo-random generator $G$ can be used as the random bits which are input to the procedure which
generates an RSA key. However, this is rather complicated, both in terms of the bits and time needed.
Therefore, whenever the key assignment is not information-theoretic it is important to use a public-key
scheme where the mapping from random bits to the keys is efficient. This is the case with the Diffie-Hellman
scheme.

Size of Public Key File: The problem is that the public key file might be large, proportional to $w$, the
number of subsets. In the Complete Subtree method $w = 2N - 1$ and in the Subset Difference method it
is $N \log N$. An interesting open problem is to come up with a public-key cryptosystem where it is possible
to  compress  the  public-keys  to  a  more  manageable  size.  For  instance,  an  identity-based  cryptosystem  would 
be  helpful  for  the  information-theoretic  case  where  keys  are  assigned  independently.  A  recent  proposal  that 
fits  this  requirement  is  [8]. 

Prefix-Truncated Headers: We would like to use Prefix-Truncation, described in Section 4.1, with
a public-key cryptosystem to reduce the header size without sacrificing the security of long-term keys. It cannot
be employed with an arbitrary public-key cryptosystem (e.g. RSA). However, a Diffie-Hellman public-key
system which can be used for the Prefix-Truncation technique can be devised in the following manner.

Let $G$ be a group with a generator $g$ and let the subset keys be $L_1 = y_1, L_2 = y_2, \ldots, L_w = y_w$ elements
in $G$. Let $g^{y_1}, g^{y_2}, \ldots, g^{y_w}$ be their corresponding public keys. Define $h$ as a pairwise-independent function
$h : G \mapsto \{0,1\}^{|K|}$ that maps elements which are randomly distributed over $G$ to randomly distributed strings
of the desired length (see e.g. [37] for a discussion of such functions). Given the subsets $S_{i_1}, \ldots, S_{i_m}$ to be
used in the header, the encryption $E$ can be done by picking a new element $x$ from $G$, publicizing $g^x$, and
encrypting $K$ as $E_{L_{i_j}}(K) = h(g^{x y_{i_j}}) \oplus K$. That is, the header now becomes

$\langle [i_1, i_2, \ldots, i_m, g^x, h, h(g^{x y_{i_1}}) \oplus K, \ldots, h(g^{x y_{i_m}}) \oplus K], F_K(M) \rangle$

Interestingly, in terms of the broadcast length such a system hardly increases the number of bits in the
header as compared with a shared-key system: the only difference is $g^x$ and the description of $h$. Therefore
this difference is fixed and does not grow with the number of revocations. Note however that the scheme
as defined above is not immune to chosen-ciphertext attacks, but only to chosen-plaintext ones. Coming up
with public-key schemes where prefix-truncation is possible that are immune to chosen-ciphertext attacks of
either kind is an interesting challenge (see footnote 6).
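
The Diffie-Hellman variant can be sketched as follows. This is a toy illustration under loudly stated assumptions: the modulus $2^{127} - 1$ and generator 3 are placeholders that are far too weak for real use, a truncated SHA-256 stands in for the pairwise-independent function $h$, and all function names are hypothetical.

```python
import hashlib
import secrets

P = 2 ** 127 - 1  # toy prime modulus (assumption; NOT a secure group choice)
G = 3             # toy generator (assumption)
KEY_BYTES = 7     # |K| = 56 bits


def h(elem: int) -> bytes:
    """Stand-in for the pairwise-independent h: G -> {0,1}^|K|."""
    return hashlib.sha256(elem.to_bytes(16, "big")).digest()[:KEY_BYTES]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(s ^ t for s, t in zip(a, b))


def keygen() -> tuple[int, int]:
    """Return a subset key y and its public key g^y."""
    y = secrets.randbelow(P - 2) + 1
    return y, pow(G, y, P)


def encrypt_header(public_keys: dict[int, int], K: bytes):
    """Sender side: uses only the public-key file; returns (g^x, entries)."""
    x = secrets.randbelow(P - 2) + 1
    gx = pow(G, x, P)
    entries = {i: xor(h(pow(pk, x, P)), K) for i, pk in public_keys.items()}
    return gx, entries


def decrypt_entry(y: int, gx: int, entry: bytes) -> bytes:
    """Receiver side: recompute h(g^{xy}) from g^x and its subset key y."""
    return xor(h(pow(gx, y, P)), entry)
```

A single value $g^x$ is shared by all $m$ entries, so the header grows by a fixed amount relative to the shared-key case.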

4.4  Applications  to  Multicast 

The  difference  between  key  management  for  the  scenario  considered  in  this  paper  and  for  the  Logical  Key 
Hierarchy  for  multicast  is  that  in  the  latter  the  users  (i.e.  receivers)  may  update  their  keys  [43,  42].  This 
update  is  referred  to  as  a  re-keying  event  and  it  requires  all  users  to  be  connected  during  this  event  and 
change  their  internal  state  (keys)  accordingly.  However,  even  in  the  multicast  scenario  it  is  not  reasonable 
to  assume  that  all  the  users  receive  all  the  messages  and  perform  the  required  update.  Therefore  some 
mechanism  that  allows  individual  update  must  be  in  place.  Taking  the  stateless  approach  gets  rid  of  the 
need  for  such  a  mechanism:  simply  add  a  header  to  each  message  denoting  who  are  the  legitimate  recipients 
by  revoking  those  who  should  not  receive  it.  In  case  the  number  of  revocations  is  not  too  large  this  may 
yield a more manageable solution. This is especially relevant when there is a single source for sending the
messages or when public-keys are used.

Backward secrecy: Note that revocation in itself lacks backward secrecy in the following sense: a constantly
listening user that has been revoked from the system records all future transmissions (which it can’t
decrypt anymore) and keeps all ciphertexts. At a later point it gains a valid new key (by re-registering)

6 Both the scheme of Cramer and Shoup [14] and the random oracle based scheme [25] require some specific information for
each recipient; a possible approach with random oracles is to add a zero-knowledge proof of knowledge for $x$.
which allows decryption of all past communication. Hence, a newly acquired user-key can be used to decrypt
all past session keys and ciphertexts. The way that [43, 42] propose to achieve backward secrecy is
to perform re-keying when new users are added to the group (such a re-keying may be reduced to only
one-way chaining, known as LKH+), thus making such operations non-trivial. We point out that in the
subset-cover framework and especially in the two methods we proposed it may be easier: at any given point of the
system include in the set of revoked receivers all identities that have not been assigned yet. As a result, a
newly assigned user-key cannot help in decrypting an earlier ciphertext. Note that this is feasible since we
assume that new users are assigned keys in a consecutive order of the leaves in the tree, so unassigned keys
are consecutive leaves in the complete tree and can be covered by at most $\log N$ sets (of either type, the
Complete Subtree method or the Subset Difference method). Hence, the unassigned leaves can be treated
with the hierarchical revocation technique, resulting in adding at most $\log N$ revocations to the message.
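
The claim that the unassigned leaves fit into at most $\log N$ complete subtrees can be illustrated with a short sketch. It assumes leaves are indexed $0, \ldots, N-1$ with $N = 2^{\text{depth}}$ and that leaves are assigned in increasing order; each returned block `(start, size)` corresponds to one complete subtree. The function name is hypothetical.

```python
def cover_unassigned(first_unassigned: int, depth: int) -> list[tuple[int, int]]:
    """Cover leaves first_unassigned .. N-1 with aligned power-of-two blocks.

    Each (start, size) block is the leaf set of one complete subtree.
    """
    n = 1 << depth
    blocks = []
    start = first_unassigned
    while start < n:
        # Largest aligned block beginning at `start`; it always fits below n
        # because n is a power of two.
        size = (start & -start) if start else n
        blocks.append((start, size))
        start += size
    return blocks


if __name__ == "__main__":
    print(cover_unassigned(1, 3))   # N = 8, only leaf 0 assigned
    print(cover_unassigned(13, 4))  # N = 16, leaves 0..12 assigned
```

The block sizes are strictly increasing powers of two, so for a non-empty set of assigned leaves there are at most $\log N$ blocks.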

4.5  Comparison  to  CPRM 

CPRM/CPPM (Content Protection for Recordable Media and Pre-Recorded Media) is a technology announced
and developed by 4 companies (known as the 4C’s): IBM, Intel, Matsushita and Toshiba [18]. It
defines  a  method  for  protecting  content  on  physical  media  such  as  recordable  DVD,  DVD  Audio,  Secure 
Digital  Memory  Card  and  Secure  CompactFlash.  A  licensing  Entity  (the  Center)  provides  a  unique  set  of 
secret  device  keys  to  be  included  in  each  device  at  manufacturing  time.  The  licensing  Entity  also  provides 
a  Media  Key  Block  (MKB)  to  be  placed  on  each  compliant  media  (for  example,  on  the  DVD).  The  MKB  is 
essentially  the  Header  of  the  ciphertext  which  encrypts  the  session  key.  It  is  assumed  that  this  header  resides 
on  a  write-once  area  on  the  media,  e.g.  a  Pre-embossed  lead-in  area  on  the  recordable  DVD.  When  the 
compliant  media  is  placed  in  a  player/recorder  device,  it  computes  the  session  key  from  the  Header  (MKB) 
using  its  secret  keys;  the  content  is  then  encrypted/decrypted  using  this  session  key. 

The algorithm employed by CPRM is essentially a Subset-Cover scheme. Consider a table with $A$ rows
and $C$ columns. Every device (receiver) is viewed as a collection of $C$ entries from the table, exactly one
from each column, that is $u = [u_1, \ldots, u_C]$ where $u_i \in \{0, 1, \ldots, A - 1\}$. The collection of subsets
$S_1, \ldots, S_w$ defined by this algorithm corresponds to subsets of receivers that share the same entry at a given
column, namely $S_{r,i}$ contains all receivers $u = [u_1, \ldots, u_C]$ such that $u_i = r$. For every $0 \le i \le A - 1$ and
$1 \le j \le C$ the scheme associates a key denoted by $L_{i,j}$. The private information $I_u$ that is provided to a
device $u = [u_1, \ldots, u_C]$ consists of $C$ keys $L_{u_1,1}, L_{u_2,2}, \ldots, L_{u_C,C}$.

For a given set $\mathcal{R}$ of revoked devices, the method partitions $\mathcal{N} \setminus \mathcal{R}$ as follows: $S_{i,j}$ is in the cover iff
$S_{i,j} \cap \mathcal{R} = \emptyset$. While this partition guarantees that a revoked device is never covered, there is a low probability
that a non-revoked device $u \notin \mathcal{R}$ will not be covered as well and therefore become non-functional (see footnote 7).

The  CPRM  method  is  a  Subset-Cover  method  with  two  exceptions:  (1)  the  subsets  in  a  cover  are  not 
necessarily  disjoint  and  (2)  the  cover  is  not  always  perfect  as  a  non-revoked  device  may  be  uncovered.  Note 
that the CPRM method is not r-flexible: the probability that a non-revoked device is uncovered grows with
r,  hence  in  order  to  keep  it  small  enough  the  number  of  revocations  must  be  bounded  by  A. 
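
The cover rule and its imperfect coverage can be made concrete with a small sketch. It models a device as a tuple of row indices, one per column, following the description above; the tiny table size and the function names are assumptions chosen so that the uncovered case is easy to see.

```python
def cprm_cover(revoked: list[tuple[int, ...]], rows: int, cols: int) -> set[tuple[int, int]]:
    """Cells (r, i) whose subset S_{r,i} contains no revoked device."""
    hit = {(u[i], i) for u in revoked for i in range(cols)}
    all_cells = {(r, i) for r in range(rows) for i in range(cols)}
    return all_cells - hit


def is_covered(u: tuple[int, ...], cover: set[tuple[int, int]]) -> bool:
    """A device can decrypt if at least one of its C cells is in the cover."""
    return any((u[i], i) in cover for i in range(len(u)))


if __name__ == "__main__":
    A, C = 4, 2
    revoked = [(0, 0), (1, 1)]
    cover = cprm_cover(revoked, A, C)
    print(is_covered((0, 0), cover))  # revoked device
    print(is_covered((2, 0), cover))  # legitimate device sharing one cell
    print(is_covered((0, 1), cover))  # legitimate device, all cells hit
```

In this example device `(0, 1)` was never revoked, yet each of its cells is shared with some revoked device, so it is left uncovered; this is the failure mode that grows with $r$.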

For the sake of comparing the performance of CPRM with the two methods suggested in this paper,
assume that $C \approx \log N$ and $A = r$. Then, the message is composed of $r \log N$ encryptions, the storage at
the receiver consists of $\log N$ keys and the computation at the receiver requires a single decryption. These
bounds are similar to the Complete Subtree method; however, unlike CPRM, the Complete Subtree method
is r-flexible and achieves perfect coverage. The advantage of the Subset Difference Method is much more
substantial: in addition to the above, the message consists of $1.25r$ encryptions on average, or of at most
$2r - 1$ encryptions, rather than $r \log N$.

7This  is  similar  to  the  scenario  considered  in  [27] 
For example, in DVD Audio, the amount of storage that is dedicated for its MKB (the header) is 3 MB.
This constrains the maximum allowed message length. Under a certain choice of parameters, such as the
total number of manufactured devices and the number of distinct manufacturers, with the current CPRM
algorithm the system can revoke up to about 10,000 devices. In contrast, for the same set of parameters and
the same 3 MB constraint, a Subset Difference algorithm achieves up to 250,000 (!) revocations, a factor of 25
improvement over the currently used method. This major improvement is partly due to the fact that hierarchical
revocation can be done very effectively, a property that the current CPRM algorithm does not have.


5  Tracing  Traitors 

It  is  highly  desirable  that  a  revocation  mechanism  could  work  in  tandem  with  a  tracing  mechanism  to  yield 
a  trace  and  revoke  scheme.  We  show  a  tracing  method  that  works  for  many  schemes  in  the  subset-cover 
framework.  The  method  is  quite  efficient.  The  goal  of  a  tracing  algorithm  is  to  find  the  identities  of  those 
that  contributed  their  keys  to  an  illicit  decryption  box  and  revoke  them;  short  of  identifying  them  we  should 
render  the  box  useless  by  finding  a  “pattern”  that  does  not  allow  decryption  using  the  box,  but  still  allows 
broadcasting  to  the  legitimate  users.  Note  that  this  is  a  slight  relaxation  of  the  requirement  of  a  tracing 
mechanism,  say  in  [34]  (which  requires  an  identification  of  the  traitor’s  identity)  and  in  particular  it  lacks 
self  enforcement  [20].  However  as  a  mechanism  that  works  in  conjunction  with  the  revocation  scheme  it  is 
a  powerful  tool  to  combat  piracy. 

The  model 

Suppose  that  we  have  found  an  illegal  decryption-box  (decoder,  or  clone)  which  contains  the  keys  associated 
with at most $t$ receivers $u_1, \ldots, u_t$ known as the “traitors”.

We  are  interested  in  “black-box”  tracing,  i.e.  one  that  does  not  take  the  decoder  apart  but  by  providing  it 
with  an  encrypted  message  and  observing  its  output  (the  decrypted  message)  tries  to  figure  out  who  leaked 
the  keys.  A  pirate  decoder  is  of  interest  if  it  correctly  decodes  with  probability  p  which  is  at  least  some 
threshold $q$, say $q > 0.5$. We assume that the box has a “reset button”, i.e. that its internal state may be
restored to some initial configuration. In particular this excludes a “locking” strategy on the part of the
decoder  which  says  that  in  case  it  detects  that  it  is  under  test,  it  should  refuse  to  decode  further.  Clearly 
software-based  systems  can  be  simulated  and  therefore  have  the  reset  property. 

The  result  of  a  tracing  algorithm  is  either  a  subset  consisting  of  traitors  or  a  partition  into  subsets  that 
renders  the  box  useless  i.e.  given  an  encryption  with  the  given  partition  it  decrypts  with  probability  smaller 
than  the  threshold  q  while  all  good  users  can  still  decrypt. 

In  particular,  a  “subsets  based”  tracing  algorithm  devises  a  sequence  of  queries  which,  given  a  black-box 
that  decodes  with  probability  above  the  threshold  q,  produces  the  results  mentioned  above.  It  is  based  on 
constructing useful sets of revoked devices $\mathcal{R}$ which will finally allow the detection of the receiver’s identity
or  the  configuration  that  makes  the  decoder  useless.  A  tracing  algorithm  is  evaluated  based  on  (i)  the  level 
of  performance  downgrade  it  imposes  on  the  revocation  scheme  (ii)  number  of  queries  needed. 

5.1  The  Tracing  Algorithm 

Subset tracing: An important procedure in our tracing mechanism is one that given a partition $S =
S_{i_1}, S_{i_2}, \ldots, S_{i_m}$ and an illegal box outputs one of two possible outputs: either (i) that the box cannot
decrypt with probability greater than the threshold when the encryption is done with partition $S$ or (ii) finds
a subset $S_{i_j}$ such that $S_{i_j}$ contains a traitor. Such a procedure is called subset tracing. We describe it in
detail in Section 5.1.1.

Figure 4: Bifurcating a Subset Difference set $S_{i,j}$, depicted on the left. The black triangle indicates the
excluded subtree. L and R are the left and the right children of $v_i$. The resulting sets $S_{L,j}$ and $S_{i,L}$ are
depicted on the right.

Bifurcation property: Given a subset-tracing procedure, we describe a tracing strategy that works for
many Subset-Cover revocation schemes. The property that the revocation algorithm should satisfy is that
for any subset $S_i$, $1 \le i \le w$, it is possible to partition $S_i$ into two (or a constant number of) roughly equal sets, i.e. that
there exist $1 \le i_1, i_2 \le w$ such that $S_i = S_{i_1} \cup S_{i_2}$ and $|S_{i_1}|$ is roughly the same as $|S_{i_2}|$. For a
Subset-Cover scheme, let the bifurcation value be the relative size of the largest subset in such a split.

Both the Complete Subtree and the Subset Difference methods satisfy this requirement: in the case of
the Complete Subtree Method each subset, which is a complete subtree, can be split into exactly two equal
parts, corresponding to the left and right subtrees. Therefore the bifurcation value is 1/2. As for the Subset
Difference Method, each subset $S_{i,j}$ can be split into two subsets each containing between one third and two
thirds of the elements. Here, again, this is done using the left and right subtrees of node $i$. See Figure 4. The
only exception is when $i$ is a parent of $j$, in which case the subset is the complete subtree rooted at the other
child; such subsets can be perfectly split. The worst case of (1/3, 2/3) occurs when $i$ is the grandparent of
$j$. Therefore the bifurcation value is 2/3.

The  Tracing  Algorithm:  We  now  describe  the  general  tracing  algorithm,  assuming  that  we  have  a  good 
subset tracing procedure. The algorithm maintains a partition $S_{i_1}, S_{i_2}, \ldots, S_{i_m}$. At each phase one of the
subsets  is  partitioned,  and  the  goal  is  to  partition  a  subset  only  if  it  contains  a  traitor. 

Each phase initially applies the subset-tracing procedure with the current partition $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$.
If the procedure outputs that the box cannot decrypt with $S$ then we are done, in the sense that we have
found a way to disable the box without hurting any legitimate user. Otherwise, let $S_{i_j}$ be the set output by
the procedure, namely $S_{i_j}$ contains a traitor.

If $S_{i_j}$ contains only one possible candidate, it must be a traitor and we permanently revoke this user;
this doesn’t hurt a legitimate user. Otherwise we split $S_{i_j}$ into two roughly equal subsets and continue with
the new partitioning. The existence of such a split is assured by the bifurcation property.

Analysis: Since a partition can occur only in a subset that has a traitor and contains more than one element,
it follows that the number of iterations can be at most $t \log_a N$, where $a$ is the inverse of the bifurcation
value (a more refined expression is $t(\log_a N - \log_2 t)$, the number of edges in a binary tree with $t$ leaves and
depth $\log_a N$).
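
The tracing loop can be simulated for the Complete Subtree method, where every subset splits into its left and right subtrees. The sketch below is idealised: the pirate box is modelled simply as a set of traitor leaves, and the subset tracing procedure is replaced by an oracle that directly reports a subset containing a traitor (the real procedure is the subject of Section 5.1.1). Heap-style node numbering and all names are assumptions of the example.

```python
def is_under(node: int, leaf: int) -> bool:
    """True if `leaf` lies in the complete subtree rooted at `node` (heap numbering)."""
    while leaf > node:
        leaf //= 2
    return leaf == node


def find_traitor_subset(partition: list[int], traitors: set[int]):
    """Idealised subset tracing: return a subtree containing a traitor, else None."""
    for node in partition:
        if any(is_under(node, t) for t in traitors):
            return node
    return None


def trace(depth: int, traitors: set[int]):
    """Run the tracing loop; return (revoked leaves, final partition, number of splits)."""
    first_leaf = 1 << depth
    partition = [1]  # start with a single subset: the whole tree
    revoked = []
    splits = 0
    while True:
        node = find_traitor_subset(partition, traitors)
        if node is None:  # the box can no longer decrypt
            return revoked, partition, splits
        partition.remove(node)
        if node >= first_leaf:
            revoked.append(node)  # single candidate: it must be a traitor
        else:
            partition += [2 * node, 2 * node + 1]  # bifurcate the subset
            splits += 1


if __name__ == "__main__":
    revoked, partition, splits = trace(4, {19, 28})
    print(sorted(revoked), sorted(partition), splits)
```

In this model the number of splits never exceeds $t \log N$, since a subset is split only when it lies on the path from the root to some traitor.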

5.1.1  The  Subset  Tracing  Procedure 

The Subset Tracing procedure first tests whether the box decodes with sufficiently high probability (say
greater than 0.5) when the partition is $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$. If not, then it concludes (and outputs) that the
box cannot decrypt with $S$. Otherwise, it needs to find a subset $S_{i_j}$ that contains a traitor.

Let $p_j$ be the probability that the box decodes the ciphertext

$\langle [i_1, i_2, \ldots, i_m, E_{L_{i_1}}(R_K), E_{L_{i_2}}(R_K), \ldots, E_{L_{i_j}}(R_K), E_{L_{i_{j+1}}}(K), \ldots, E_{L_{i_m}}(K)], F_K(M) \rangle$

where $R_K$ is a random string of the same length as the key $K$. That is, $p_j$ is the probability of decoding
when the first $j$ subsets are noisy and the remaining subsets encrypt the correct key. Note that $p_0 = p$ and
$p_m = 0$, hence there must be some $0 < j \le m$ for which $|p_{j-1} - p_j| \ge \frac{p}{m}$.

Claim 7 Let $\varepsilon$ be an upper bound on the sum of the probabilities of breaking the encryption scheme $E$ and
key assignment method. If $p_{j-1}$ is different from $p_j$ by more than $\varepsilon$, then the set $S_{i_j}$ must contain a traitor.

Proof: From the box’s point of view, a ciphertext that contains $j - 1$ noisy subsets is different from a ciphertext
that contains $j$ noisy subsets only if the box is able to distinguish between $E_{L_{i_j}}(K)$ and $E_{L_{i_j}}(R_K)$.
Since this cannot be due to breaking the encryption scheme or the key assignment method alone, it follows
that the box must contain $L_{i_j}$. □

We now describe a binary-search-like method that efficiently finds a pair of values $p_j, p_{j-1}$ among
$p_0, \ldots, p_m$ satisfying $|p_{j-1} - p_j| > \varepsilon$. Starting with the entire interval $[0, m]$, the search is repeatedly narrowed
down to an arbitrary interval $[a, b]$. At each stage, the middle value $p_{\frac{a+b}{2}}$ is computed (approximately)
and the interval is further halved either to the left half or to the right half, depending on the difference between
$p_{\frac{a+b}{2}}$ and the endpoint values $p_a$ and $p_b$ of the interval and favoring the interval with the larger difference.
The method is outlined below; it outputs the index $j$.


SubsetTracing(a, b, p_a, p_b)
    If (a == b - 1)
        return b
    Else
        c = ⌈(a + b)/2⌉
        Find p_c
        If |p_c - p_a| > |p_b - p_c|
            SubsetTracing(a, c, p_a, p_c)
        Else
            SubsetTracing(c, b, p_c, p_b)
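
The outline above can be turned into a short runnable sketch. The pirate box is modelled here as a callable `box(j)` that returns whether it decoded a ciphertext whose first $j$ subsets are noisy; this interface, the fixed sample count, and the iterative (rather than recursive) form are assumptions of the example, and the sample count is not tuned to the bounds of Claim 8.

```python
from typing import Callable


def estimate(box: Callable[[int], bool], j: int, samples: int) -> float:
    """Empirical p_j: fraction of successful decodings with the first j subsets noisy."""
    return sum(1 for _ in range(samples) if box(j)) / samples


def subset_tracing(box: Callable[[int], bool], m: int, samples: int = 2000) -> int:
    """Binary-search-like search for an index j with a large gap |p_{j-1} - p_j|."""
    a, b = 0, m
    p_a = estimate(box, a, samples)
    p_b = estimate(box, b, samples)
    while a != b - 1:
        c = (a + b + 1) // 2  # ceil((a + b) / 2), strictly between a and b
        p_c = estimate(box, c, samples)
        if abs(p_c - p_a) > abs(p_b - p_c):
            b, p_b = c, p_c  # larger gap is in the left half
        else:
            a, p_a = c, p_c  # larger gap is in the right half
    return b


if __name__ == "__main__":
    # Idealised box holding only the key of the 5th subset in the partition:
    # it decodes exactly when that entry is not noisy, i.e. when j < 5.
    print(subset_tracing(lambda j: j < 5, m=8, samples=10))
```

With a real (probabilistic) box each estimate carries sampling error, which is what the efficiency analysis accounts for.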


Efficiency: Let the probability of error be $\varepsilon$ and the range error be $\delta$. Subset tracing is comprised of $\log m$
steps. At each step it should decide with probability at least $1 - \varepsilon$ the following:

• If $\frac{|p_c - p_a|}{|p_b - p_a|} > \frac{1}{2}(1 + \delta)$, decide “$|p_c - p_a| > |p_b - p_c|$”

• If $\frac{|p_c - p_a|}{|p_b - p_a|} < \frac{1}{2}(1 - \delta)$, decide “$|p_c - p_a| < |p_b - p_c|$”

• Otherwise, any decision is acceptable.

In order to distinguish these two cases apply Claim 8 below. Since the Claim is applied $\log m$ times, choose
$\delta = \frac{1}{\log m}$. At each step with probability at least $1 - \varepsilon$ the interval $|p_b - p_a|$ shrinks to no less than a
$\frac{1}{2}(1 - \delta)$ fraction of its length, so at the $i$th step the interval length is (with probability at least $1 - i \cdot \varepsilon$) larger than
$(\frac{1}{2}(1 - \delta))^i$; hence the smallest possible interval, when $i = \log m$, is of length
$\Delta > (\frac{1}{2}(1 - \delta))^{\log m} = \Omega(1/m)$ with probability at least $1 - \varepsilon \log m$.
It follows that a subset tracing procedure that works with success probability of $(1 - \varepsilon \log m)$ requires at
most $O(m^2 \log\frac{1}{\varepsilon} \log^3 m)$ ciphertext queries to the decoding box over the entire procedure. Note that a total
probability of success bounded away from zero is acceptable, since it is possible to verify that the resulting
$p_{j-1}, p_j$ differ, and hence $\varepsilon$ can be $O(1/\log m)$.

Claim 8 Let $p_a, p_b$ be the two probabilities at the end-points of an interval $[a, b]$ such that $|p_a - p_b| \ge \Delta$,
and let $X_c$ be a random variable such that $\mathrm{Prob}[X_c = 1] = p_c$ where $p_c$ is unknown. We would like to
sample the decoding box and decide "$|p_c - p_a| > |p_b - p_c|$" or "$|p_c - p_a| < |p_b - p_c|$" according to the
definition given above. The number of samples (i.e. ciphertext queries) required to reach this conclusion
with error at most $\varepsilon$ is $O(\log(\frac{1}{\varepsilon})/(\Delta^2\delta^2))$.

Proof: Let $X_a$, $X_b$ and $X_c$ be $\{0, 1\}$ variables satisfying $P[X_a = 1] = p_a$, $P[X_b = 1] = p_b$ and
$P[X_c = 1] = p_c$. The variant of Chernoff bounds described in [1], Corollary A.7 [p. 236], states that for a sequence of
mutually independent random variables $Y_1, \ldots, Y_n$ satisfying $P[Y_i = 1 - p] = p$ and $P[Y_i = -p] = 1 - p$
it holds that $P[|\sum_{i=1}^{n} Y_i| > t] < 2e^{-2t^2/n}$ for $t > 0$. Suppose we want to estimate $p_a$ by sampling $n_a$
times from the distribution of $X_a$. Let $p'_a$ be the estimate that results from this sampling. Applying the
above Chernoff bound to the centered samples (a deviation of $t$ in the mean is a deviation of $t n_a$ in the sum)
we can conclude that $P[|p'_a - p_a| > t] < 2e^{-2 n_a t^2}$. Hence, by choosing
$n_a = \frac{\ln(2/\varepsilon)}{2t^2}$, the estimate $p'_a$ obtained from sampling $n_a$ times satisfies $P[|p'_a - p_a| > t] < \varepsilon$. Clearly,
by sampling $n > n_a$ times the $\varepsilon$-bounded error is also achieved. Analogously, this analysis holds for
the process of sampling from $X_b$ and $X_c$, where $p'_b$ and $p'_c$ are the estimates that result from sampling the
distributions $X_b$ and $X_c$.

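In order to decide whether "$|p_c - p_a| > |p_b - p_c|$" or "$|p_c - p_a| < |p_b - p_c|$":

• Sample $n = O(\log(\frac{1}{\varepsilon})/(\Delta^2\delta^2))$ times from each of the distributions $X_a$, $X_b$ and $X_c$ and compute
$p'_a, p'_b, p'_c$, the estimates for $p_a, p_b, p_c$ respectively.

• If $p'_c > \frac{p'_a + p'_b}{2}$ then decide "$|p_c - p_a| > |p_b - p_c|$"

• If $p'_c < \frac{p'_a + p'_b}{2}$ then decide "$|p_c - p_a| < |p_b - p_c|$"

The number of samples conducted by this procedure is $3n = O(\log(\frac{1}{\varepsilon})/(\Delta^2\delta^2))$. We now have to
show that this decision is in accordance with the definition above. Note that the Chernoff bound implies
that with probability $1 - \varepsilon$ we have (i) $p'_a \in (p_a - \frac{\Delta\delta}{4}, p_a + \frac{\Delta\delta}{4})$, (ii) $p'_b \in (p_b - \frac{\Delta\delta}{4}, p_b + \frac{\Delta\delta}{4})$ and
(iii) $p'_c \in (p_c - \frac{\Delta\delta}{4}, p_c + \frac{\Delta\delta}{4})$.

If $\frac{p_c - p_a}{p_b - p_a} > \frac{1}{2}(1 + \delta)$ then by substituting $\Delta \le p_b - p_a$ we get that $p_c > \frac{p_a + p_b}{2} + \frac{\Delta\delta}{2}$. Combining
this with the above, $p'_c > p_c - \frac{\Delta\delta}{4} > \frac{p_a + p_b}{2} + \frac{\Delta\delta}{4} > \frac{p'_a + p'_b}{2}$, so the correct decision is reached. Similarly
for the case $\frac{p_c - p_a}{p_b - p_a} < \frac{1}{2}(1 - \delta)$. □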
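The sampling decision in Claim 8 can be sketched in Python. This is a minimal illustration, not the authors' procedure verbatim: the samplers are hypothetical stand-ins for querying the decoding box, the accuracy target `t = gap * delta / 4` is an assumed constant choice, and the comparison of absolute differences coincides with testing the estimate of p_c against the midpoint of the other two estimates whenever it lies between them.

```python
import math
import random

def samples_needed(eps, gap, delta):
    """Smallest n with 2*exp(-2*n*t*t) <= eps, for accuracy t = gap*delta/4 (assumed)."""
    t = gap * delta / 4.0
    return math.ceil(math.log(2.0 / eps) / (2.0 * t * t))

def estimate(sampler, n):
    """Empirical mean of n draws from a {0,1}-valued sampler."""
    return sum(sampler() for _ in range(n)) / n

def closer_to_b(sample_a, sample_b, sample_c, n):
    """True if the estimates indicate |p_c - p_a| > |p_b - p_c|."""
    pa, pb, pc = (estimate(s, n) for s in (sample_a, sample_b, sample_c))
    return abs(pc - pa) > abs(pb - pc)

def bernoulli(p, rng):
    """Hypothetical sampler standing in for one query to the decoding box."""
    return lambda: 1 if rng.random() < p else 0

rng = random.Random(1)
n = samples_needed(eps=0.01, gap=0.5, delta=0.5)
print(n, closer_to_b(bernoulli(0.9, rng), bernoulli(0.4, rng), bernoulli(0.45, rng), n))
```

The sample count grows as $1/(\Delta^2\delta^2)$, which is the source of the $m^2$ factor in the overall query bound once the interval length drops to about $1/m$.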
Noisy binary search: A more sophisticated procedure is to treat the Subset-Tracing procedure as noisy
binary search, as in [22], where it is shown that in a model where each answer is correct with some fixed
probability (say greater than 2/3) that is independent of history it is possible to perform a binary search
in $O(\log N)$ queries. Each step might require backtracking; in the subset-tracing scenario, the procedure
backtracks if the condition $|p_a - p_b| \ge (\frac{1}{2})^i$ does not hold at the $i$th step (which indicates an error in an
earlier decision). Estimating the probability values within an accuracy of $\frac{1}{m}$ while guaranteeing a constant
probability of error requires only $O(m^2)$ ciphertext queries. This effectively means that we can fix $\delta$ and
$\varepsilon$ to be constants (independent of $m$). Therefore, we can perform the noisy binary search procedure with
$O(m^2 \log m)$ queries.

5.2  Improving  the  Tracing  Algorithm 

The basic traitor tracing algorithm described above requires $t\log(N/t)$ iterations. Furthermore, since at
each iteration the number of subsets in the partition increases by one, tracing $t$ traitors may result in
up to $t\log(N/t)$ subsets and hence in messages of length $t\log(N/t)$. This bound holds for any
Subset-Cover method satisfying the Bifurcation property, and both the Complete Subtree and the Subset Difference
methods satisfy this property. What is the bound on the number of traitors that the algorithm can trace?

Recall that the Complete Subtree method requires a message length of $r\log(N/r)$ for $r$ revocations,
hence  the  tracing  algorithm  can  trace  up  to  r  traitors  if  it  uses  the  Complete  Subtree  method.  However,  since 
the  message  length  of  the  Subset  Difference  method  is  at  most  2r  -  1,  only  traitors  can  be  traced 

if Subset Difference is used. We now describe an improvement on the basic tracing algorithm that reduces
the number of subsets in the partition to $5t - 1$ for the Subset Difference method (although the number of
iterations remains $t\log(N/t)$). With this improvement the algorithm can trace up to $r/5$ traitors.

Note that among the $t\log(N/t)$ subsets generated by the basic tracing algorithm, only $t$ actually contain a
traitor. The idea is to repeatedly merge those subsets which are not known to contain a traitor.⁸ Specifically,
we maintain at each iteration a frontier of at most $2t$ subsets plus $3t - 1$ additional subsets. In the following
iteration  a  subset  that  contains  a  traitor  is  further  partitioned;  as  a  result,  a  new  frontier  is  defined  and  the 
remaining  subsets  are  re-grouped. 

Frontier subsets: Let $S_{i_1}, S_{i_2}, \ldots, S_{i_m}$ be the partition at the current iteration. A pair of subsets
$(S_{i_{j_1}}, S_{i_{j_2}})$ is said to be in the frontier if $S_{i_{j_1}}$ and $S_{i_{j_2}}$ resulted from a split-up of a single subset at an
earlier iteration and neither $S_{i_{j_1}}$ nor $S_{i_{j_2}}$ was singled out by the subset tracing procedure so far. This
definition implies that the frontier is composed of $k$ disjoint pairs of buddy subsets. Since buddy subsets are
disjoint and since each pair originated from a single subset that contained a traitor (and therefore has been
split), $k \le t$.

We can now describe the improved tracing algorithm, which proceeds in iterations. Every iteration starts
with a partition $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$. Denote by $F \subset S$ the frontier of $S$. An iteration consists of the
following steps, by the end of which a new partition $S'$ and a new frontier $F'$ are defined.

• As before, use the Subset Tracing procedure to find a subset $S_{i_j}$ that contains a traitor. If the tracing
procedure outputs that the box cannot decrypt with $S$ then we are done. Otherwise, split $S_{i_j}$ into $S_{i_{j_1}}$
and $S_{i_{j_2}}$.

• $F' = F \cup S_{i_{j_1}} \cup S_{i_{j_2}}$ ($S_{i_{j_1}}$ and $S_{i_{j_2}}$ are now in the frontier). Furthermore, if $S_{i_j}$ was in the frontier $F$
and $S_{i_k}$ was its buddy subset in $F$ then $F' = F' \setminus S_{i_k}$ (remove $S_{i_k}$ from the frontier).

⁸ This idea is similar to the second scheme of [24], Section 3.3. However, in [24] the merge is straightforward as their model
allows  any  subset.  In  our  model  only  members  from  the  Subset  Difference  are  allowed,  hence  a  merge  which  produces  subsets  of 
this  particular  type  is  non-trivial. 
• Compute a cover $C$ for all receivers that are not covered by $F'$. Define the new partition $S'$ as the
union of $C$ and $F'$.

To  see  that  the  process  described  above  converges,  observe  that  at  each  iteration  the  number  of  new 
small  frontier  sets  always  increases  by  at  least  one.  More  precisely,  at  the  end  of  each  iteration  construct 
a vector of length $N$ describing how many sets of size $i$, $1 \le i \le N$, constitute the frontier. It is easy to
see  that  these  vectors  are  lexicographically  increasing.  The  process  must  stop  when  or  before  all  sets  in  the 
frontier  are  singletons. 

By definition, the number of subsets in a frontier can be at most $2t$. Furthermore, they are paired into at
most $t$ disjoint pairs of buddy subsets. As for non-frontier subsets ($C$), Lemma 4 shows that covering the remaining
elements can be done by at most $|F| \le 3t - 1$ subsets (note that we apply the lemma so as to cover all
elements that are not covered by the buddy subsets, and there are at most $t$ of them). Hence the partition at
each iteration is composed of at most $5t - 1$ subsets.

5.3  Tracing  Traitors  from  Many  Boxes 

As  new  illegal  decoding  boxes,  decoding  clones  and  hacked  keys  are  continuously  being  introduced  during 
the  lifetime  of  the  system,  a  revocation  strategy  needs  to  be  adopted  in  response.  This  revocation  strategy 
is  computed  by  first  revoking  the  identities  (leaves)  of  all  the  receivers  that  need  to  be  excluded,  resulting 
in some partition $S_0$. Furthermore, to trace traitors from possibly more than one black box and make all
of these boxes non-decoding, the tracing algorithm needs to be run in parallel on all boxes by providing
all boxes with the same input. The initial input is the partition $S_0$ that results from direct revocation of
all known identities. As the algorithm proceeds, when the first box detects a traitor in one of the sets it
re-partitions accordingly and the new partition is now input to all boxes simultaneously. The output of this
simultaneous algorithm is a partition (or "revocation strategy") that renders all revoked receivers and illegal
black boxes invalid.

6  Security  of  the  Framework 

In  this  section  we  discuss  the  security  of  a  Subset-Cover  algorithm.  Intuitively,  we  identify  a  critical  property 
that  is  required  from  the  key-assignment  method  in  order  to  provide  a  secure  Subset-Cover  algorithm.  We 
say that a subset-cover algorithm satisfies the "key-indistinguishability" property if for every subset $S_i$ its
key $L_i$ is indistinguishable from a random key given all the information of all users that are not in $S_i$.
We  then  proceed  to  show  that  any  subset-cover  algorithm  that  satisfies  the  key-indistinguishability  property 
provides  a  secure  encryption  of  the  message. 

We  must  specify  what  is  a  secure  revocation  scheme,  i.e.  describe  the  adversary’s  power  and  what  is 
considered  a  successful  break.  We  provide  a  sufficient  condition  for  a  Subset-Cover  revocation  scheme  A 
to  be  secure.  We  start  by  stating  the  assumptions  on  the  security  of  the  encryption  schemes  E  and  F .  All 
security  definitions  given  below  refer  to  an  adversary  whose  challenge  is  of  the  form:  distinguish  between 
two  cases  ‘i’  and  ‘ii’. 

6.1  Assumptions  on  the  Primitives 

Recall that the scheme employs two cryptographic primitives $F_K$ and $E_L$. The security requirements of
these two methods are different, since $F_K$ uses short-lived keys whereas $E_L$ uses long-lived ones. In both
cases we phrase the requirements in terms of the probability of success in distinguishing an encryption
of the true message from an encryption of a random message. It is well known that such a formulation is
equivalent to semantic security (that anything that can be computed about the message given the ciphertext
is computable without it), see [29, 28, 4].⁹

The method $F_K$ for encrypting the body of the message should obey the following property: consider
any feasible adversary $B$ that chooses a message $M$ and receives for a randomly chosen $K \in \{0, 1\}^{\ell}$ one
of the following: (i) $F_K(M)$ (ii) $F_K(R_M)$ for a random message $R_M$ of length $|M|$. The probability that $B$
distinguishes the two cases is negligible and we denote the bound by $\varepsilon_1$, i.e.

\[ |\Pr[B \text{ outputs ‘i’} \mid F_K(M)] - \Pr[B \text{ outputs ‘i’} \mid F_K(R_M)]| \le \varepsilon_1. \]

Note that implementing $F_K$ by a pseudo-random generator (stream cipher) where $K$ acts as the seed
and whose output is XORed bit by bit with the message satisfies this security requirement.
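A minimal sketch of such a stream-cipher instantiation of $F_K$ is given below. It rests on assumptions not made in the text: SHAKE-256 from Python's standard `hashlib` is used as a stand-in for the pseudo-random generator, and the function names `f_encrypt` and `f_decrypt` are hypothetical. As with any keystream construction, the key $K$ must be fresh for every message, which matches the short-lived role of $K$ here.

```python
import hashlib

def f_encrypt(key: bytes, message: bytes) -> bytes:
    """XOR the message with a keystream expanded from the seed `key`.
    SHAKE-256 plays the role of the pseudo-random generator (an assumption)."""
    stream = hashlib.shake_256(key).digest(len(message))
    return bytes(m ^ s for m, s in zip(message, stream))

# XOR with the same keystream inverts the encryption.
f_decrypt = f_encrypt

key = bytes(range(16))            # illustrative fixed key; use a fresh random K in practice
body = f_encrypt(key, b"broadcast body")
print(f_decrypt(key, body))
```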

The long-term encryption method $E_L$ should withstand a more severe attack, in the following sense:
consider any feasible adversary $B$ that for a random key $L$ gets to adaptively choose polynomially many
inputs and examine $E_L$'s encryption and similarly provide ciphertexts and examine $E_L$'s decryption. Then $B$
is faced with the following challenge: for a random plaintext $x$ (which is provided in the clear) it receives one
of (i) $E_L(x)$ or (ii) $E_L(R_x)$ where $R_x$ is a random string of length $|x|$. The probability that $B$ distinguishes
the two cases is negligible and we denote the bound by $\varepsilon_2$, i.e.

\[ |\Pr[B \text{ outputs ‘i’} \mid E_L(x)] - \Pr[B \text{ outputs ‘i’} \mid E_L(R_x)]| \le \varepsilon_2. \]

Note that the above specification indicates that $E$ should withstand a chosen-ciphertext attack in the
preprocessing mode in the terminology of [19] or CCA-I in [3]. A possible implementation of $E_L$ is via
pseudo-random permutations (which model block ciphers). See more details on the efficient implementation
of $F$ and $E$ in Section 4.1.

Key Assignment: Another critical cryptographic operation performed in the system is the key assignment
method, i.e. how a user $u$ derives the keys $L_i$ for the sets $S_i$ such that $u \in S_i$. We now identify an important
property that the key assignment method in a subset-cover algorithm should possess and that will turn out to be
sufficient to provide security for the scheme:

Definition 9 Let $A$ be a Subset-Cover revocation algorithm that defines a collection of subsets $S_1, \ldots, S_w$.
Consider a feasible adversary $B$ that

1. Selects $i$, $1 \le i \le w$

2. Receives the $I_u$'s (secret information that $u$ receives) for all $u \in \mathcal{N} \setminus S_i$

We say that $A$ satisfies the key-indistinguishability property if the probability that $B$ distinguishes $L_i$ from a
random key $R_{L_i}$ of similar length is negligible and we denote this by $\varepsilon_3$, i.e.

\[ |\Pr[B \text{ outputs ‘i’} \mid L_i] - \Pr[B \text{ outputs ‘i’} \mid R_{L_i}]| \le \varepsilon_3. \]

Note that all "information theoretic" key assignment schemes, namely schemes in which the keys to all
the subsets are chosen independently, satisfy Definition 9 with $\varepsilon_3 = 0$.
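To make the information-theoretic case concrete, here is a small sketch under an assumed Complete-Subtree-style collection: every node of a full binary tree over the users defines the subset of leaves below it, each node gets an independent random key, and $I_u$ consists of the keys on the path from the root to the leaf of $u$. The heap-style node numbering and the names `assign_keys` and `subset_members` are illustrative choices, not taken from the text.

```python
import secrets

def assign_keys(n_users, key_len=16):
    """n_users is assumed to be a power of two. Nodes are numbered 1..2n-1
    heap-style; the leaf of user u is node n_users + u."""
    keys = {v: secrets.token_bytes(key_len) for v in range(1, 2 * n_users)}
    info = {}
    for u in range(n_users):
        v = n_users + u
        path = []
        while v >= 1:             # walk from the leaf up to the root
            path.append(v)
            v //= 2
        info[u] = {node: keys[node] for node in path}
    return keys, info

def subset_members(v, n_users):
    """Users (leaves) in the subtree rooted at node v, i.e. the subset S_v."""
    lo, hi = v, v
    while lo < n_users:           # descend to the leftmost and rightmost leaves
        lo, hi = 2 * lo, 2 * hi + 1
    return set(range(lo - n_users, hi - n_users + 1))

keys, info = assign_keys(8)
outsiders = [u for u in range(8) if u not in subset_members(5, 8)]
print(any(5 in info[u] for u in outsiders))
```

Because the key of node 5 is drawn independently of everything the outsiders hold, their pooled information is statistically independent of it, which is the $\varepsilon_3 = 0$ situation.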

The  next  lemma  is  a  consequence  of  the  key-indistinguishability  property  and  will  be  used  in  the  proof 
of  Theorem  12,  the  Security  Theorem. 

⁹ One actually has to repeat such an equivalence proof for the adversaries in question.
Lemma 10 For any $1 \le i \le w$ let $S_{i_1}, S_{i_2}, \ldots, S_{i_t}$ be all the subsets that are contained in $S_i$. Let
$L_{i_1}, \ldots, L_{i_t}$ be their corresponding keys. For any adversary $B$ that selects $i$, $1 \le i \le w$, and receives
$I_u$ for all $u \in \mathcal{N} \setminus S_i$, if $B$ attempts to distinguish the keys $L_{i_1}, \ldots, L_{i_t}$ from random keys $R_{L_{i_1}}, \ldots, R_{L_{i_t}}$
(of similar lengths) then

\[ |\Pr[B \text{ outputs ‘i’} \mid L_{i_1}, \ldots, L_{i_t}] - \Pr[B \text{ outputs ‘i’} \mid R_{L_{i_1}}, \ldots, R_{L_{i_t}}]| \le t \cdot \varepsilon_3. \]


Proof: Let us rename the subsets $S_{i_1}, S_{i_2}, \ldots, S_{i_t}$ as $S_1, S_2, \ldots, S_t$ and order them according to their size,
that is for all $j = 1, \ldots, t$, $S_j \subseteq S_i$ and $|S_1| \ge |S_2| \ge \cdots \ge |S_t|$. We will now use a hybrid argument:
consider an "input of the $j$th type" as one where the first $j$ keys are the true keys and the remaining $t - j$
keys are random keys. $\forall\, 1 \le j \le t$, let $p_j$ be the probability that $B$ outputs ‘i’ when challenged with an input
of the $j$th type, namely

\[ p_j = \Pr[B \text{ outputs ‘i’} \mid L_1, \ldots, L_j, R_{L_{j+1}}, \ldots, R_{L_t}] \]

Suppose that the lemma doesn't hold, that is $|p_t - p_0| > t \cdot \varepsilon_3$. Hence there must be some $j$ for which
$|p_j - p_{j-1}| > \varepsilon_3$. We now show how to create an adversary $B'$ that can distinguish between $R_{L_j}$ and $L_j$
with advantage $> \varepsilon_3$, contradicting the key-indistinguishability property. The actions of $B'$ result from a
simulation of $B$:

• When $B$ selects $S_i$, $B'$ selects the subset $S_j \subseteq S_i$ from the above discussion (that is, the $j$ for which
$|p_j - p_{j-1}| > \varepsilon_3$). It receives $I_u$ for all $u \in \mathcal{N} \setminus S_j$ and hence can provide $B$ with $I_u$ for all $u \in \mathcal{N} \setminus S_i$.

• When $B'$ is given a challenge $X$ and needs to distinguish whether $X$ is $R_{L_j}$ or $L_j$, it creates a challenge
to $B$ that will be $L_1, \ldots, L_j, R_{L_{j+1}}, \ldots, R_{L_t}$ or $L_1, \ldots, L_{j-1}, R_{L_j}, R_{L_{j+1}}, \ldots, R_{L_t}$. Note that due to
their order $S_1, \ldots, S_{j-1} \not\subseteq S_j$; since $B'$ received $I_u$ for all $u \in \mathcal{N} \setminus S_j$ it knows the keys $L_1, \ldots, L_{j-1}$,
while $R_{L_{j+1}}, \ldots, R_{L_t}$ are chosen at random. The $j$th string in the challenge is simply $X$ (the one $B'$
received as a challenge). $B'$'s response is simply $B$'s answer to the query.

The advantage that $B'$ has in distinguishing between $R_{L_j}$ and $L_j$ is exactly the advantage $B$ has in
distinguishing between $L_1, \ldots, L_j, R_{L_{j+1}}, \ldots, R_{L_t}$ and $L_1, \ldots, L_{j-1}, R_{L_j}, R_{L_{j+1}}, \ldots, R_{L_t}$, which is by
assumption larger than $\varepsilon_3$, contradicting the key-indistinguishability property. □
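The step from the gap between the two extreme hybrids to a gap between some pair of neighbouring hybrids is the usual telescoping argument; written out with the probabilities $p_0, \ldots, p_t$ of the proof:

```latex
\[
|p_t - p_0|
  = \Bigl|\sum_{j=1}^{t} (p_j - p_{j-1})\Bigr|
  \le \sum_{j=1}^{t} |p_j - p_{j-1}|
  \le t \cdot \max_{1 \le j \le t} |p_j - p_{j-1}| ,
\]
\[
\text{so } |p_t - p_0| > t \cdot \varepsilon_3
\;\Longrightarrow\;
\max_{1 \le j \le t} |p_j - p_{j-1}| > \varepsilon_3 .
\]
```

Here $p_0$ corresponds to the all-random input and $p_t$ to the all-true input, which are exactly the two cases compared in the statement of the lemma.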

6.2  Security  Definition  of  a  Revocation  Scheme 

To define the security of a revocation scheme we first have to consider the power of the adversary in this
scenario (and make pessimistic assumptions on its ability). The adversary can pool the secret information of
several users, and it may have some influence on the choice of messages encrypted in this scheme (chosen
plaintext). Also it may create bogus messages and see how legitimate users (that will not be revoked) react.
Finally, to say that the adversary has broken the scheme means that when the users who have provided it their
secret information are all revoked (otherwise it is not possible to protect the plaintext) the adversary can still
learn something about the encrypted message. Here we define "learn" as distinguishing its encryption from
random (again this is equivalent to semantic security).

Definition 11 Consider an adversary $B$ that gets to

1. Select adaptively a set $\mathcal{R}$ of receivers and obtain $I_u$ for all $u \in \mathcal{R}$. By adaptively we mean that $B$ may
select messages $M_1, M_2, \ldots$ and revocation sets $\mathcal{R}_1, \mathcal{R}_2, \ldots$ (the revocation sets need not correspond
to the actual corrupted users) and see the encryption of $M_i$ when the revoked set is $\mathcal{R}_i$. Also $B$ can
create a ciphertext and see how any (non-corrupted) user decrypts it. It then asks to corrupt a receiver
$u$ and obtains $I_u$. This step is repeated $|\mathcal{R}|$ times (for any $u \in \mathcal{R}$).

2. Choose a message $M$ as the challenge plaintext and a set $\mathcal{R}$ of revoked users that must include all the
ones it corrupted (but may contain more).

$B$ then receives an encrypted message $M'$ with a revoked set $\mathcal{R}$. It has to guess whether $M' = M$ or
$M' = R_M$ where $R_M$ is a random message of similar length. We say that a revocation scheme is secure if
for any (probabilistic polynomial time) adversary $B$ as above, the probability that $B$ distinguishes between
the two cases is negligible.

6.3  The  Security  Theorem 

We  now  state  and  prove  the  main  security  theorem,  showing  that  the  key-indistinguishability  property  is 
sufficient  for  a  scheme  in  the  subset-cover  framework  to  be  secure  in  the  sense  of  Definition  11.  Precisely, 

Theorem 12 Let $A$ be a Subset-Cover revocation algorithm where the key assignment satisfies the
key-indistinguishability property (Definition 9) and where $E$ and $F$ satisfy the above requirements. Then $A$ is
secure in the sense of Definition 11 with security parameter $\delta \le \varepsilon_1 + 2mw(\varepsilon_2 + 4w\varepsilon_3)$, where $w$ is the total
number of subsets in the scheme and $m$ is the maximum size of a cover.

Proof: Let $A$ be a Subset-Cover revocation algorithm with the key-indistinguishability property. Let $B$ be
an adversary that behaves according to Definition 11, where $\delta$ is the probability that $B$ distinguishes between
an encryption of $M$ and an encryption of a random message of similar length.

Recall that the adversary adaptively selects a set of receivers $\mathcal{R}$ and obtains $I_u$ for all $u \in \mathcal{R}$. $B$ then
selects a challenge message $M$. Let $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$ be the cover of $\mathcal{N} \setminus \mathcal{R}$ defined by $A$. As a
challenge, $B$ then receives an encrypted message and is asked to guess whether it encrypts $M$ or a random
message $R_M$ of the same length as $M$. We consider $B$'s behavior in case not all the encryptions are proper.
Let a "ciphertext of the $j$th type" be one where the first $j$ subsets are noisy and the remaining subsets encode
the correct key. In other words the body is the encryption using $F_K$ and the header is:

\[ [i_1, i_2, \ldots, i_m, E_{L_{i_1}}(R_K^1), E_{L_{i_2}}(R_K^2), \ldots, E_{L_{i_j}}(R_K^j), E_{L_{i_{j+1}}}(K), \ldots, E_{L_{i_m}}(K)] \]

where $K$ is a random key and $\{R_K^i\}$ are random strings of the same length as the key $K$. Let $\Delta_j$ be the
advantage that for a ciphertext of the $j$th type $B$ distinguishes between the cases where $F_K(M)$ or $F_K(R_M)$
are the body of the message, i.e.

\[ \Delta_j = |\Pr[B \text{ outputs ‘i’} \mid \text{body is } F_K(M)] - \Pr[B \text{ outputs ‘i’} \mid \text{body is } F_K(R_M)]| , \]
where the header is of the $j$th type.

The assumption that $B$ can break the revocation system implies that $\Delta_0 = \delta$. We also know that
$\Delta_m \le \varepsilon_1$, the upper bound on the probability of breaking $F_K$, since in ciphertexts of the $m$th type the
encryptions $E_{L_{i_j}}$ in the header contain no information on the key $K$ used for the body so $K$ looks random
to $B$. Hence there must be some $0 < j \le m$ such that

\[ |\Delta_{j-1} - \Delta_j| \ge \frac{\delta - \varepsilon_1}{m} . \]

For this $j$ it must be the case that for either $M$ or $R_M$ the difference in the probability that $B$ outputs ‘i’
between the case when the header is of the $j$th type and when it is of the $(j-1)$th type (and the same
message is in the body) is at least $\frac{\delta - \varepsilon_1}{2m}$.

A ciphertext of the $(j-1)$th type is noticeably different from a ciphertext of the $j$th type only if it
is possible to distinguish between $E_{L_{i_j}}(K)$ and $E_{L_{i_j}}(R_K)$. Therefore, the change in the distinguishing
advantage $|\Delta_{j-1} - \Delta_j| \ge \frac{\delta - \varepsilon_1}{m}$ can be used to either break the encryption $E_L$ or to achieve an advantage
in distinguishing the keys. We will now show how $B$ can be used to construct an adversary $B'$ that either
breaks $E_L$ or breaks the key-indistinguishability property, as extended by Lemma 10. This in turn is used to
derive bounds on $\delta$.

Formally, we now describe an adversary $B'$ that will use $B$ as follows.

• $B'$ picks at random $1 \le i \le w$ and asks to obtain $I_u$ for all $u \notin S_i$; this is a guess that $S_{i_j} = S_i$.

• $B'$ receives either $L_0, L_1, \ldots, L_t$ or $R_{L_0}, R_{L_1}, \ldots, R_{L_t}$ where $L_0 = L_i$, the key of the subset $S_i$, and
$L_1, \ldots, L_t$ are defined as the keys in Lemma 10. It attempts to distinguish between the case where
the input corresponds to true keys and the case where the input consists of random keys.

• $B'$ simulates $B$ as well as the Center that generates the ciphertexts and uses $B$'s output:

  - When the Center is faced with the need to encrypt (or decrypt) using the key of subset $S_j$ such
    that $S_j \not\subseteq S_i$ then it knows at least one $u \in S_j$; from $I_u$ it is possible to obtain $L_j$ and encrypt
    appropriately. If $S_j \subseteq S_i$ then $B'$ uses the key that was provided to it (either $L_j$ or $R_{L_j}$).

  - When $B$ decides to corrupt a user $u$, if $u \notin S_i$, then $B'$ can provide it with $I_u$. If $u \in S_i$ then the
    guess that $i_j = i$ was wrong and we abort the simulation.

  - When the Center needs to generate the challenge ciphertext $M$ for $B$, $B'$ finds a cover for $\mathcal{R}$, the
    set of users corrupted by $B$. If $i_j \ne i$, then the guess was wrong and the simulation is aborted.
    Otherwise a random key $K$ is chosen and a body of a message encrypted with $K$ is generated
    where the encrypted message is either $M$ or $R_M$ (depending on which of the two the difference between
    $\Delta_j$ and $\Delta_{j-1}$ is due to) and one of two experiments is performed:

    Experiment $j$: Create a header of ciphertext of the $j$th type.

    Experiment $j-1$: Create a header of ciphertext of the $(j-1)$th type.

    Provide as challenge to $B$ the created header and body.

• If the simulation was aborted, output ‘i’ or ‘ii’ at random. Otherwise provide $B$'s output.

Denote by $P_L^j$ (and $P_L^{j-1}$ resp.) the probability that in experiment $j$ (experiment $j-1$) in case the input
to $B'$ are the true keys the simulated $B$ outputs ‘i’; denote by $P_R^j$ (and $P_R^{j-1}$ resp.) the probability that in
experiment $j$ (experiment $j-1$) in case the input to $B'$ are random keys the simulated $B$ outputs ‘i’. We
claim that the differences between all these 4 probabilities can be bounded:

Claim 13 $|P_L^j - P_L^{j-1}| \ge \frac{\delta - \varepsilon_1}{2mw}$

Proof: In case the input to $B'$ are the true keys, the resulting distribution that the simulated $B$ experiences is
what it would experience in a true execution (where the difference between a $j$th-type ciphertext and a $(j-1)$th-type
ciphertext is at least $\frac{\delta - \varepsilon_1}{2m}$). The probability that the guess was correct is $1/w$ and this is independent of the
action of $B$, so we can experience a difference of at least $\frac{\delta - \varepsilon_1}{2mw}$ between the two cases. □
Claim 14 $|P_R^j - P_R^{j-1}| \le \varepsilon_2$

Proof: Since otherwise we can use $B'$ to attack $E$: whenever there is a need to use the key corresponding
to the set $S_i$, ask for an encryption using the random key. Similarly use the $K$ in the challenge for $E$ as the
one in the challenge of $B$. □

Claim 15 $|P_R^j - P_L^j| \le w \cdot \varepsilon_3$ and $|P_R^{j-1} - P_L^{j-1}| \le w \cdot \varepsilon_3$

Proof: If any of the two inequalities does not hold, then we can use $B'$ as an adversary for Lemma 10 and
contradict the safety of the key assignment (we know that $t \le w$). □


From these three claims and applying the inequality $|a - b| - |c - d| \le |a - c| + |b - d|$ we can conclude
that

\[ \frac{\delta - \varepsilon_1}{2wm} - \varepsilon_2 \le 2w \cdot \varepsilon_3 \]

and hence the overall security parameter of $A$ satisfies $\delta \le \varepsilon_1 + 2mw(\varepsilon_2 + 2w\varepsilon_3)$.

□
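For reference, the way the three claims combine can be written out explicitly. The chain below reads Claim 13 as the lower bound $(\delta - \varepsilon_1)/(2mw)$ and substitutes $a = P_L^j$, $b = P_L^{j-1}$, $c = P_R^j$, $d = P_R^{j-1}$ in the inequality; this substitution is an interpretation of the text rather than a quotation from it.

```latex
\[
\frac{\delta - \varepsilon_1}{2mw} - \varepsilon_2
  \;\le\; |P_L^j - P_L^{j-1}| - |P_R^j - P_R^{j-1}|
  \;\le\; |P_L^j - P_R^j| + |P_L^{j-1} - P_R^{j-1}|
  \;\le\; 2w \cdot \varepsilon_3 ,
\]
\[
\text{hence}\quad
\delta \;\le\; \varepsilon_1 + 2mw\,(\varepsilon_2 + 2w\,\varepsilon_3).
\]
```

The first inequality uses Claims 13 and 14, the second is the triangle inequality, and the third applies Claim 15 to each term.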

Weaker notions of security: It is interesting to deal with the case where the encryption provided by $F$ is not
so strong. To combat copyright piracy it may not make sense to protect a specific ciphertext so that breaking
it is very expensive; on the other hand we do want to protect the long-lived keys of the system. The security
definition (Definition 11) can easily be adapted to the case where distinguishing $F_K(M)$ from $F_K(R_M)$
cannot be done in some time $T_1$ where $T_1$ is not too large (this may correspond to using a not very long
key $K$): the challenge following the attack is to distinguish $F_K(M)$ from $F_K(R_M)$ in time less than $T_1'$ not
much smaller than $T_1$. Essentially the same statement and proof of security as Theorem 12 hold. The fact
that retrieving $K$ does not have to be intractable, just simply expensive, means that $K$ does not necessarily
have to be long; see the discussion on the implications for the total message length in Section 4.1.

It is also possible to model the case where the protection that $F_K$ provides is not indistinguishability (e.g.
$F_K$ encrypts only parts of the message $M$ that are deemed more important). In this case we should argue that
the header does not provide more information regarding $M$ than does $F_K(M)$. More precisely, suppose that
$\mathcal{M}$ is a distribution on messages $M$ and let $B$ be an adversary that attacks the system as in Definition 11 but
is given as a challenge a valid encryption of a message $M \in_R \mathcal{M}$ and attempts to compute some function of
$M$ (e.g. $M$ defines a piece of music and the function is to map it to sounds). A scheme is considered secure
if for any $\mathcal{M}$ and $B$ there is a $B'$ that simply receives $F_K(M)$ without the header and (i) performs an amount
of work proportional to $B$ after receiving the challenge and (ii) whose output is indistinguishable from $B$'s
output; the distinguisher should have access to $M$. Here again for any subset cover algorithm where $E$ and
the key assignment algorithm satisfy the requirements of Section 6.1 the resulting scheme will satisfy the
relaxed definition.
1. INTRODUCTION

The networks of the future will be able to support gigabit bandwidths for large
groups of users. These users will possess various quality-of-service options and
multimedia applications that include video, voice, and data, all on the same network
backbone. The desire to create small groups of users all interconnected and
capable of communicating with each other, but who are securely isolated from all
other users on the network, is being expressed strongly in a variety of communities.
Today most network applications are based upon the client-server paradigm and
make use of unicast (peer-to-peer) packet delivery. However, many emerging applications
(e.g., wargaming, law enforcement, teleconferencing, command and control
conferencing, disaster relief, distributed computing, and collaborative work)
are based upon a group communication model. That is, they require packet delivery
from one or more authorized sender(s) to a variable, large number of authorized
receivers. While peer-to-peer security is a mature and well developed field, secure
group communication remains relatively unexplored. Contrary to a common initial
impression, secure group communication is not a simple extension of secure
two-party communication. There are two important differences. First, protocol
efficiency is of greater concern due to the number of participants and to the distances
among them. The second difference is due to group dynamics. Two-party
communication can be viewed as a discrete phenomenon: it starts, lasts for a while
and normally ends. Group communication is usually more complicated: it starts,
the group mutates (members leave and join) and there might not be a well-defined
end. This complicates security services among which group key management (i.e.,
initializing the secure group with a common net key, rekeying the group, etc.) is
the most important.

In  group  communications,  data  authentication  comes  in  two  different  kinds.  In 
one  case,  integrity  and  authentication  of  origin  with  respect  to  a  specific  individual 
sender  are  guaranteed.  We  then  speak  of  individual  authentication.  In  the  other 
case,  data  is  known  to  have  been  sent  and/or  modified  by  a  member  of  a  specified 
group  -  we  do  not  know  which  member,  and  actually  any  group  member  could  even 
have  modified  the  data  while  in  transit.  We  then  speak  of  group  authentication. 

Individual  authentication  is  especially  important  in  broadcast  applications.  In 
this  case,  there  is  usually  one  main  transmitter  and  a  possibly  large,  or  very  large, 
number  of  receivers.  This  makes  group  authentication  of  little  use  -  we  want  to 
know  that  the  broadcast  data  comes  from  the  designated  transmitter;  receivers 
cannot  be  trusted  and  usually  they  do  not  need  to  be  so  because  they  just  receive 
the  information. 

This paper addresses individual authentication. We obtain a general authentication
protocol that is secure even when attackers may adaptively choose authenticated
data streams. For an interactive scenario, typical of multicast conferencing, the
proposed solution is substantially more efficient than previous approaches.

This paper is organised as follows: section 2 reviews the work on individual
authentication in group communication, and section 3 presents the chained stream
authentication (CSA) protocol and discusses the concept of security against
continuations. The two following sections present two variants of CSA, one for
interacting parties (section 4) and one for broadcast communication (section 5).
Then, section 6 discusses the implementation of the presented protocol in a real
application, and section 7 draws some conclusions.

2.  RELATED  WORK 

Individual  authentication  is  not  commonly  found  in  broadcast  and  multicast,  mainly 
because  it  is  technically  difficult  or  inefficient.  There  are  two  obvious  solutions. 
First,  digital  signatures  could  be  used  for  each  block  of  data  being  broadcast.  This 
works, and is secure, but is inefficient. Especially for receivers that may be equipped
only with simple hardware, or even with set-top-box solutions, this may be a major
difficulty. Furthermore, in these applications the receiving hardware is already
busy  with  multimedia  processing  and  cannot  spend  precious  time  and  resources  in 
authentication.  In  teleconferences  this  may  also  apply  to  senders.  Moreover,  digital 
signatures  bring  non-repudiation,  which  may  not  be  desired  or  necessary  in  some 
cases.  Second,  symmetric  message  authentication  codes  (MACs),  which  are  a  lot 
faster,  could  be  used  and  should  be  of  no  concern  from  the  point  of  view  of  efficiency. 
But  then  the  broadcaster  needs  a  different  shared  key  for  each  receiver.  This  causes 
two  problems:  (1)  every  block  needs  to  be  authenticated  with  as  many  MACs  as 
there  are  receivers  -  which  is  very  inconvenient,  e.g.,  for  TV-like  broadcast,  and  (2) 
key  management  is  very  complex. 

In the literature, a number of solutions have been proposed that may be adequate
in some contexts. With the choice of symmetric authentication, it is possible
to use a fixed number K < N of MAC keys that does not depend on the size
N of the multicast group [Canetti and Pinkas 1998; Dyer et al. 1995]. The sender
knows K keys, and each receiver knows K/2 keys, chosen at random and distributed
by a key distribution center. Each stream block is authenticated with K MACs,
corresponding to the K keys, and receivers check the MACs for which they have
a key. Obviously, receiver collusion can lead to forged individual authentication
codes. More work is available for the choice of efficient digital signatures.
On-line/off-line signatures [Even et al. 1989] may be used to split the computation
into an expensive offline phase and an efficient online phase performed when the
data becomes available. Part of the signature can be performed by a signature server,
without compromising authentication or non-repudiation [Asokan et al. 1996].
One-time signatures [Bleichenbacher and Maurer 1996; Lamport 1979; Merkle 1987]
are an efficient alternative to be used for stream authentication. Gennaro and Rohatgi
[Gennaro and Rohatgi 1997] have proposed an efficient stream signing method that
is based on a chain of one-time signatures. Only the first block is signed with a
standard digital signature, and it includes a one-time public key to be used in the
second step. The corresponding one-time secret key is used to sign the second block,
together with a one-time public key used in the third step. The process is continued
until the end of the stream, and real-time authentication and non-repudiation
of every stream prefix is guaranteed. The disadvantage of this approach lies in the
length of the authentication information. However, this length does not depend on
the number of participants, and the technique scales up well to very large multicast
groups and even broadcast applications. A similar scheme has also been applied to
authentication in routing protocols [Zhang 1998].

The problem of lengthy authentication information is solved by the Guy Fawkes
protocol [Anderson et al. 1998], which uses hash functions to bind a sequence of
events to an initial value and to guarantee the integrity of the sequence. Establishing
by any means (e.g., a digital signature) the identity of the entity that knows the initial
value makes it possible to authenticate all the subsequent events. Thus, the sender
first commits to a string of the form $Z_i = \{M_i, X_i, h(X_{i+1})\}$, where $M_i$ denotes
message $i$ and $X_i$ stands for a random number used as a codeword. This commitment
binds the message to the codeword and its successor. The sender then reveals the
value of this string, proving her knowledge of the codeword and thus authenticating
herself. An attacker could otherwise mount a man-in-the-middle attack by simply
intercepting $Z_{i+1}$ and the following message that reveals the codeword, and thereby
insert a forged message in the stream without being detected by the recipient. To
avoid this, the first random secret number (codeword) needs to be bootstrapped by
using, for example, a digital signature.

Guy Fawkes was an important milestone but suffered from two major limitations.
First, the sender needs to receive an acknowledgement that the receiver has received
a packet before she can send the next one. This requirement fits protocols like TCP
well, but it is expensive to implement in protocols like ICMP that do not
require an acknowledgement for each packet sent. Publication of a key before due
acknowledgement allows an opponent to forge the rest of the stream. Second, the
protocol does not tolerate packet loss.

Cheung [Cheung 1997] uses a method very similar to Guy Fawkes to efficiently
authenticate link state updates (LSUs) that are periodically distributed to routers.
In this application the message is always an LSU that describes the status of the
links incident to the routers. The protocol uses an optimistic approach in the
sense that the LSU is immediately accepted as valid when it is received by the router,
and after a tuneable but fixed delay the corresponding key is expected, so that
the accepted LSU can actually be validated. The method does not distinguish
between forged LSUs and the case where the packet containing the validation key
is simply lost but the corresponding LSU is genuine. This is an important limitation
because packet losses are very likely in large networks. Besides, this method
requires some degree of synchronization among all routers, which is not
easy to achieve in practice. The author does not provide any assurance about the
correctness of the scheme.

More recently, Wong and Lam [Wong and Lam 1998] introduced a chaining
technique based on Merkle trees [Merkle 1987], suitable for efficiently authenticating
both real-time and non-real-time streams. Their approach is in some respects
similar to ours, but we use as a primitive the hash chains introduced by Lamport
[Lamport 1979] instead. As we do, they use a single digital signature for a
block of packets. However, their signing operation, even if delay-bounded, incurs a
higher delay than ours: we do not need to build a tree before starting to compute
packet digests, since each packet digest is independent of the others
[Bergadano et al. ; Bergadano et al. 2000a; Bergadano et al. 2000b]. Our verification
procedure is also more efficient. We need to compute a constant number of packet
digests (2) for each verification, while they need to compute O(log(d)) digests,
where d is the height of the tree. Another very important difference between all the
solutions we have presented so far and our scheme is that they are not robust. In
their scheme, if a packet is lost, the authentication of all the remaining packets of
the tree that have not yet been authenticated is lost. With our algorithm, if we lose
a packet only a single packet of data is affected; furthermore, our chaining
technique can continue without disruption and can guarantee authentication of
the following packets of the chain.

Recently, Perrig et al. [Perrig et al. 2000] introduced two schemes, TESLA and
EMSS, that have been developed from ideas introduced in the Guy Fawkes protocol
[Anderson et al. 1998]. The first of their schemes, TESLA, is very similar to one of
our solutions.

To  the  best  of  our  knowledge,  all  these  groups  have  been  working  independently. 

3.  CHAINED  STREAM  AUTHENTICATION  (CSA) 

A  stream  is  a  long  sequence  (of  undefined  length)  of  bits  that  a  sender  sends  to  one 
or  more  receivers.  A  stream  is  different  from  a  message  in  the  sense  that  it  must 
be  processed  by  the  receivers  at  almost  the  same  rate  it  is  received.  Examples  of 
digital  streams  are  the  video  and  audio  sent  over  the  Internet. 

This  section  presents  an  authentication  protocol  for  entities  that  continuously 
exchange  data.  Then,  its  security  properties  are  shown. 

Following  the  formalization  of  [Goldwasser  et  al.  1988]  and  [Gennaro  and  Rohatgi 
1997],  we  give  the  following  definitions: 

Definition 1. [Negligibility]: define a security parameter, n, and say that a function
$\epsilon(n)$ is “negligible” if, for all constants c, there is $n_0$ such that, for $n > n_0$,
$\epsilon(n) < 1/n^c$.
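
As a quick illustration of Definition 1 (an added example, not part of the original text): an exponentially small function is negligible, while an inverse polynomial is not, because the required inequality fails for a suitably chosen constant c.

```latex
\epsilon(n) = 2^{-n}: \quad \forall c \;\; \exists n_0 \;\; \forall n > n_0: \;\; 2^{-n} < \frac{1}{n^{c}}
  \qquad \text{(negligible)}

\epsilon(n) = \frac{1}{n^{3}}: \quad \frac{1}{n^{3}} \ge \frac{1}{n^{4}} \;\; \text{for all } n \ge 1
  \qquad \text{(not negligible: the condition fails for } c = 4\text{)}
```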

Definition  2.  [Signature  scheme]:  a  signature  scheme  is  a  triple  (G,  Sig,  V)  of 
probabilistic  polynomial  time  algorithms,  where  (1)  G  is  used  to  generate  a  key 
pair  (SK,  PK),  where  SK  is  the  private  key  and  PK  is  the  public  key,  (2)  Sig  is 
used  to  sign  any  message  M,  using  the  secret  key  SK,  and  (3)  V  is  the  signature 
verification  algorithm,  such  that  V(PK,  M,  Sig(SK,  M))  =  1,  for  any  message  M. 

We  will  use  a  signature  scheme  that  is  secure  against  adaptively  chosen  message 
attacks  [Goldwasser  et  al.  1988]:  the  probability  of  forging  a  signature  is  negligible, 
even  when  a  signature  oracle  is  available. 

Similarly,  we  give  the  following 

Definition  3.  [Stream  authentication  scheme]:  a  stream  authentication  scheme  is 
a  triple  of  probabilistic  polynomial-time  algorithms  (GA,  AA,  VA),  where 

— On input $1^n$, GA outputs a pair of keys $(SK, PK) \in \{0, 1\}^{2n}$.

— AA is the authentication algorithm, and receives as input a secret key SK and a
stream $S = S_1, S_2, \ldots, S_i$, consisting of a finite number i of blocks. AA outputs an
authenticated stream $S' = S'_1, S'_2, \ldots, S'_i$, where $S'_j = (S_j, auth_j)$, with $auth_j$ being
some kind of authentication data.

— The  verification  algorithm  VA  is  such  that  VA(PK,  AA(SK,  S))  =  1.  When 
VA(PK,  S')  =  1,  we  will  say  that  S'  is  valid. 

Assuming 

$|T|$ : length of stream $T$;

$S^{(r)}$ : r-th stream in a set of streams $\{S^{(1)}, \ldots, S^{(n)}\}$;

$h^k(\alpha) = h^{k-1}(h(\alpha))$ : hash function h applied k times to initial value $\alpha$;

$MAC_\beta(S)$ : Message Authentication Code computed on data S with key $\beta$;

SN : session number (comprising the set of identifiers of the participants in the
case of the I-CSA protocol);

$Sig(SK_A, \alpha, \beta)$ : signature performed by A on the concatenation of data $\alpha$ and $\beta$,
where $SK_A$ is the private key of A,

we may now define our new proposed scheme.

Definition  4.  [Chained  stream  authentication  scheme]:  a  chained  stream 
authentication  scheme  is  a  stream  authentication  scheme  where: 

— As  a  generator  GA,  we  use  the  generator  of  a  signature  scheme  (G,  Sig,  V),  secure 
against  chosen  message  attacks. 

—The authentication algorithm AA will be called a Chained Stream Authentication
algorithm (CSA). This algorithm first generates a secret $\alpha$, computes $h^k(\alpha)$ for
some $k > i$, and then produces the following output:

$S'_1 = S_1, MAC_{h^{k-1}(\alpha)}(S_1), h^k(\alpha), SN, Sig(SK, h^k(\alpha), SN)$

$S'_2 = S_2, MAC_{h^{k-2}(\alpha)}(S_2), h^{k-1}(\alpha)$

...

$S'_i = S_i, MAC_{h^{k-i}(\alpha)}(S_i), h^{k-i+1}(\alpha)$

By MAC, we denote a secure Message Authentication Code, i.e. one such that the
probability of forging a valid code is negligible, even when a MAC oracle is
available. For h, we will use a collision resistant hash function. The authentication
includes a session number SN that is incremented for every new stream.

—The  verification  algorithm  VA  will  output  1  if  the  initial  asymmetric  signature 
is  valid,  if  all  the  MACs  are  correct,  and  if  the  hash  chain  of  the  MAC  keys  is 
consistent,  i.e.  the  hash  of  a  key  produces  the  previous  key  in  the  chain. 
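
The scheme of Definition 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: SHA-256 stands in for h, HMAC-SHA256 for MAC, k is fixed to i + 1, and the signature scheme is passed in as two hypothetical callables `sign` and `verify_sig`. Since the key of the last block's MAC is not revealed inside the stream itself, the sketch checks the MACs of all blocks except the last.

```python
import hashlib
import hmac
import os
from typing import Callable, List, Tuple

Block = Tuple[bytes, bytes, bytes]  # (S_j, MAC tag, revealed chain value)

def h(x: bytes) -> bytes:
    # Stand-in for the collision resistant hash h (SHA-256 is an assumption).
    return hashlib.sha256(x).digest()

def mac(key: bytes, data: bytes) -> bytes:
    # Stand-in for the secure MAC (HMAC-SHA256 is an assumption).
    return hmac.new(key, data, hashlib.sha256).digest()

def hash_chain(alpha: bytes, k: int) -> List[bytes]:
    # chain[j] = h^j(alpha) for j = 0..k
    chain = [alpha]
    for _ in range(k):
        chain.append(h(chain[-1]))
    return chain

def csa_authenticate(blocks: List[bytes], sn: bytes,
                     sign: Callable[[bytes], bytes]):
    i = len(blocks)
    k = i + 1                      # any k > i would do
    chain = hash_chain(os.urandom(32), k)
    out: List[Block] = []
    for j, block in enumerate(blocks, start=1):
        key = chain[k - j]         # MAC key h^{k-j}(alpha)
        revealed = chain[k - j + 1]  # h^{k-j+1}(alpha), the key of block j-1
        out.append((block, mac(key, block), revealed))
    signature = sign(chain[k] + sn)  # Sig(SK, h^k(alpha), SN)
    return out, sn, signature

def csa_verify(auth: List[Block], sn: bytes, signature: bytes,
               verify_sig: Callable[[bytes, bytes], bool]) -> bool:
    if not auth or not verify_sig(auth[0][2] + sn, signature):
        return False
    for (blk, tag, rev), (_, _, next_rev) in zip(auth, auth[1:]):
        # next_rev is the key used for this block; hashing it must give rev.
        if h(next_rev) != rev:
            return False
        if not hmac.compare_digest(mac(next_rev, blk), tag):
            return False
    return True
```

In this sketch a receiver can process the stream block by block: each new block reveals the key needed to check the MAC of the block before it.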

3.1  Security  against  Continuations 
We  define  continuations  as  follows: 

Definition 5. [Continuation of a stream]: A stream $S_2$ is a continuation of a
stream $S_1$, denoted by $S_1 \subset S_2$, if $S_1$ is a proper prefix of $S_2$.
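
As a small illustration of Definition 5, the proper-prefix test can be written as below. The function name `is_continuation` and the representation of a stream as a sequence of byte blocks are assumptions made for this sketch.

```python
from typing import Sequence

def is_continuation(s1: Sequence[bytes], s2: Sequence[bytes]) -> bool:
    """Return True if s1 is a proper prefix of s2, i.e. s2 continues s1."""
    return len(s1) < len(s2) and list(s2[:len(s1)]) == list(s1)
```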

The same definition applies to authenticated streams under (GA, AA, VA). A
valid authenticated continuation $S'_2$ of an authenticated stream $S'_1$ must then be
such that $S'_1 \subset S'_2$ and $VA(PK, S'_2) = 1$.

What are the security properties of the proposed scheme? Clearly, given a stream
authentication oracle, it is possible to forge new valid authenticated streams,
because the MAC keys become known. Therefore, CSA is not “secure” according to
the definition of [Gennaro and Rohatgi 1997], which can be rephrased as follows:

Definition 6. A stream authentication scheme (GA, AA, VA) is secure if any
probabilistic polynomial-time algorithm F, given as input the public key PK and
adaptively chosen authenticated streams $S'^{(1)}, \ldots, S'^{(k)}$, outputs a new valid
authenticated stream $S' \neq S'^{(j)}$, for all j, only with negligible probability.

This  definition  of  security  obviously  does  not  apply  to  the  CSA  scheme:  the 
forger  F  may  ask  for  just  one  authenticated  stream,  change  any  block  but  the  last, 
and  recompute  the  corresponding  MACs  using  the  available  keys. 
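
The forgery described above can be reproduced in a few lines. The sketch below is illustrative only and rests on assumptions: SHA-256 stands in for h, HMAC-SHA256 for MAC, and the initial signature is omitted because the forger leaves the signed value $h^k(\alpha)$ untouched. All function names are hypothetical.

```python
import hashlib
import hmac
import os

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()  # assumed stand-in for h

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()  # assumed stand-in for MAC

def make_stream(blocks):
    """Honest sender: [(S_j, MAC_{h^{k-j}(alpha)}(S_j), h^{k-j+1}(alpha))]."""
    k = len(blocks) + 1
    chain = [os.urandom(32)]
    for _ in range(k):
        chain.append(h(chain[-1]))
    return [(b, mac(chain[k - j], b), chain[k - j + 1])
            for j, b in enumerate(blocks, start=1)]

def chain_and_macs_ok(stream) -> bool:
    """Check every block except the last, whose key is not yet revealed."""
    for (blk, tag, rev), (_, _, nxt) in zip(stream, stream[1:]):
        if h(nxt) != rev or not hmac.compare_digest(mac(nxt, blk), tag):
            return False
    return True

def forge_block(stream, j: int, new_block: bytes):
    """Replace block j (0-based, not the last) using the key revealed by block j+1."""
    key = stream[j + 1][2]
    forged = list(stream)
    forged[j] = (new_block, mac(key, new_block), stream[j][2])
    return forged
```

The forged stream differs from the observed one yet passes the same checks, which is why the scheme cannot meet Definition 6.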
However, our scheme satisfies a weaker security notion, which we will call
“security against continuations”. Security against continuations means, informally,
that it is infeasible to produce valid continuations of observed valid streams. This
weaker notion will nevertheless be sufficient for building secure authentication
protocols, after some means of sender/receiver synchronization is achieved, as
described in Sections 4.1 and 4.2. More precisely, security against continuations
corresponds to the following:

Intuition 1. A forger may produce a new valid stream S only if it is associated
to a stream T that was used previously. Moreover, if |S| = |T|, and the last blocks
of S and T are different, then it will be impossible to produce a valid continuation
of S.

Next,  we  formalize  the  above  intuition  and  prove  that  it  applies  to  CSA  (in 
Lemma  1). 

Definition 7. [Security against continuations]: A stream authentication scheme
is secure against continuations if there is no polynomial time algorithm F that,
given adaptively chosen authenticated streams $S'^{(1)}, \ldots, S'^{(k)}$, is able to generate a
valid authenticated stream $S' = S'_1, \ldots, S'_j$ with non-negligible probability, unless

$\exists i \in [1, k]$ such that $S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, Sig(SK, H, SN)$, where $S'_1 = S_1,
MAC_1, H, SN, Sig(SK, H, SN)$, and one of the following holds:

(1) $j < |S'^{(i)}|$, or

(2) $j = |S'^{(i)}|$, and $S'_j = S'^{(i)}_j$, or

(3) $j = |S'^{(i)}|$, and $S'_j \neq S'^{(i)}_j$, and there is no polynomial time algorithm F’ that
can generate with non-negligible probability a valid continuation $T' \supset S'$, given
$S'^{(1)}, \ldots, S'^{(k)}$, possibly other adaptively chosen authenticated streams, and any
valid authenticated continuation of $S'^{(i)}$.

Security  against  continuations  is  a  complicated  notion,  but  it  will  lead  us  to  a 
simple  concept  of  stream  authentication  in  Theorem  1.  First,  though,  we  need  the 
following: 

Lemma 1. Suppose that, in the CSA authentication scheme,

(1) (G, S, V) is a secure signature scheme,

(2) g is a pseudorandom function,

(3) $h(x) = g_x(0)$ and $MAC_k(x) = g_k(1, x)$, where g concatenates all its parameters
to obtain only one,

(4) g is such that h is a collision resistant hash function.

Then, the scheme (GA, CSA, VA) is secure against continuations.

Proof (of Lemma 1).

Suppose that a forger F exists that can produce a valid authenticated stream S'
with a non-negligible probability $\epsilon$, contradicting the thesis. Then, one of the two
following cases must hold, and at least one must hold with probability at least $\epsilon/2$:

Case 1 $S'_1 = S_1, MAC_1, H, SN, Sig(SK, H, SN)$ and there is no $S'^{(i)}$ such that

$S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, Sig(SK, H, SN)$.

Then, we can use the forger F to construct an algorithm F1 that breaks the
asymmetric signature scheme (G, S, V). The constructed algorithm F1 has access to an
oracle for S, and starts by calling F as a subroutine. When F requires an
authenticated stream $S'^{(i)}$, F1 generates a random secret $\alpha^{(i)}$, computes $h^k(\alpha^{(i)})$, and
asks the oracle for $Sig(SK, h^k(\alpha^{(i)}), SN)$. Then, knowing $\alpha^{(i)}$, F1 authenticates
the rest of the stream as required by CSA, and outputs the authenticated stream
$S'^{(i)}$ for F. The process continues until, with probability at least $\epsilon/2$, F outputs
the new valid authenticated stream S'. In this case (Case 1), S' must be such that
$S'_1 = S_1, MAC_1, H, SN, Sig(SK, H, SN)$ and there is no $S'^{(i)}$ generated by F1 such
that

$S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, Sig(SK, H, SN)$.

This means that a signature Sig(SK, H, SN) was never queried by F1 to the
oracle. Hence F1 breaks (G, S, V) by outputting the new valid signature H, SN, Sig(SK, H, SN).

Case 2 $S'_1 = S_1, MAC_1, H, SN, Sig(SK, H, SN)$ and there is $S'^{(i)}$ such that
$S'^{(i)}_1 = S^{(i)}_1, MAC^{(i)}_1, H, SN, Sig(SK, H, SN)$. Then there are three cases, and one
must hold with probability at least $\epsilon/6$:

Case 2.1 $|S'| < |S'^{(i)}|$. This case does not contradict the Lemma.

Case 2.2 $|S'| > |S'^{(i)}|$. In this case we can construct F2 that can invert h, using
the forger F, and hence break g. The inverter F2 is given a value a and must
compute $h^{-1}(a)$ with non-negligible probability. F2 calls GA to generate a key pair
(SK, PK), and then runs F as a subroutine. Let $SN_{max}$ be the maximum number
of authenticated streams that F will ask for. Clearly $SN_{max}$ must be polynomially
large. F2 will then pick a random number R between 1 and $SN_{max}$. When
asked to authenticate stream $S^{(R)}$, F2 lets $\alpha^{(R)} = a$ and then follows the CSA
authentication algorithm, but chooses $k = |S^{(R)}|$. When F stops, it will output S'
such that $|S'| > |S'^{(i)}|$, with probability greater than $\epsilon/6$, and i = R with
probability $1/SN_{max}$. Hence, with probability greater than $\epsilon/(6 \cdot SN_{max})$,

$S'_{|S'^{(i)}|+1} = (S, MAC, h^{k-(|S'^{(i)}|+1)}(a)) = (S, MAC, h^{-1}(a))$.

Since $h(x) = g_x(0)$, inverting h means breaking $g_{key}$ after observing $g_{key}(0)$.

Case 2.3 $|S'| = |S'^{(i)}| = j$. There are two cases, and one must hold with
probability at least $\epsilon/12$:

Case 2.3.1 $S'_j = S'^{(i)}_j$. This case does not contradict the Lemma.

Case 2.3.2 $S'_j \neq S'^{(i)}_j$. Let $S'_j = S_j, MAC_{key}(S_j), h^{k-j+1}(\alpha)$. There are two
cases, and at least one must occur with probability at least $\epsilon/24$:

Case 2.3.2.1 $MAC_{key}(S_j)$, generated by F, and $MAC(S^{(i)}_j)$, given in stream
$S'^{(i)}$, are valid under the same key. In this case we can use F to break $g_{uk}$, for
some unknown key uk, in a polynomial algorithm F3 that has access to an oracle
for $g_{uk}$. Define again $SN_{max}$ as the maximum number of authenticated streams
that F will ask for. F3 will then pick a random number R between 1 and $SN_{max}$.

F3 will run F as a subroutine, will use GA to generate a key pair (SK, PK), and
will authenticate all streams requested by F normally using CSA, except for stream
$S^{(R)}$. For this stream, define $l = |S^{(R)}|$, and let $\alpha^{(R)} = uk$, and $k = l$. Then,
$h^{k-l+1}(\alpha^{(R)}) = h(uk)$. F3 then computes $h^{l-1}(uk), \ldots, h(uk)$, after querying the
oracle for $h(uk) = g_{uk}(0)$, and uses these values, in this order, to compute the
MACs for $S^{(R)}_1, \ldots, S^{(R)}_{l-1}$ as required in CSA. As the last authenticated block, F3
outputs $S'^{(R)}_l = (S^{(R)}_l, MAC_{uk}(S^{(R)}_l), h(uk))$, where $MAC_{uk}(S^{(R)}_l) = g_{uk}(1, S^{(R)}_l)$
is queried to the oracle. With probability $1/SN_{max}$, the authenticated stream S'
output by F is associated to stream $S'^{(R)}$, i.e., i = R. In this case, $MAC_{uk}(S^{(R)}_j)$
and $MAC_{key}(S_j)$ are valid under the same key, i.e., key = uk. F3 then outputs
$MAC_{uk}(S_j) = MAC_{key}(S_j)$, and since $S_j \neq S^{(i)}_j$, this is a new forged MAC, and a
correct value of $g_{uk}$ for the new input $(1, S_j)$, that is generated with non-negligible
probability greater than $\epsilon/(24 \cdot SN_{max})$.

Case 2.3.2.2 $MAC_{key}(S_j)$, generated by F, and $MAC(S^{(i)}_j)$, given in stream
$S'^{(i)}$, are not valid under the same key. We show that, in this case, there is no
polynomial time algorithm F’ that can generate with non-negligible probability
a valid authenticated continuation $T' \supset S'$, given $S'^{(1)}, \ldots, S'^{(k)}$, possibly other
adaptively chosen authenticated streams $T'^{(1)}, \ldots$, and any valid continuation
of $S'^{(i)}$. Suppose such a forger F’ exists, and does the above with non-negligible
probability $\epsilon'$. Let $T'_{j+1} = (T_{j+1}, MAC, key)$. Since T' is valid and is a continuation
of S', we also know that $T'_j = S'_j = (S_j, MAC_{key}(S_j), h^{k-j+1}(\alpha))$. We construct
F4 that can generate collisions for h with non-negligible probability, using F and
F’. F4 starts by generating a key pair (SK, PK). Then, it runs F as a subroutine
and authenticates the requested streams normally using CSA. F will then output
S' satisfying the conditions of this case. We also know that a continuation forger F’
exists and therefore S' has a possible valid continuation. Consequently $MAC_{key}(S_j)$
is a valid MAC for some value of key, and for the conditions in this case,
$key \neq h^{k-j}(\alpha)$. This all happens with probability greater than $\epsilon/24$. In order to obtain
key, and thus obtain a collision for h, F4 must then run F’ as a subroutine. F4
will authenticate additional streams asked by F’ normally, using CSA, and will
then produce a continuation of $S'^{(i)}$ as requested by F’, also using CSA. When
F’ outputs a valid continuation T' of S', this must include the value of key, and
F4 outputs $(key, h^{k-j}(\alpha))$ as a collision for h. This must occur with non-negligible
probability (greater than $\epsilon \cdot \epsilon'/24$). □

MAC and h are those of the CSA algorithm, and are defined through a pseudorandom
function [Goldreich et al. 1986; Bellare et al. 1996] as in (3) above, because
not only should the MAC be secure, but each key k must look random even though
h(k) is known. With the definition of (3), knowing $h(k) = g_k(0)$ gives no additional
information, as one could in any case query the oracle for $g_k$ and obtain $g_k(0)$. In
practice, one could use g = HMAC [Krawczyk et al. 1997] so as to satisfy both (2)
and (4).
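
The instantiation of condition (3) with g = HMAC might look as follows in Python. This is a sketch under assumptions not fixed by the text: HMAC-SHA256 as the concrete g, and single bytes 0x00 and 0x01 as the encodings of the constants 0 and 1.

```python
import hashlib
import hmac

def g(key: bytes, *params: bytes) -> bytes:
    # g concatenates all its parameters into a single input (condition 3).
    return hmac.new(key, b"".join(params), hashlib.sha256).digest()

def h(x: bytes) -> bytes:
    # h(x) = g_x(0): the value x itself is used as the key.
    return g(x, b"\x00")

def mac(key: bytes, data: bytes) -> bytes:
    # MAC_k(x) = g_k(1, x): the leading 1 separates MAC inputs from the h input.
    return g(key, b"\x01", data)
```

The distinct leading bytes keep the inputs used for h and for MAC disjoint under the same key.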

4.  INTERACTIVE  CSA  (I-CSA)

We  will  now  use  the  CSA  scheme  to  authenticate  information  over  an  insecure 
network,  as  initially  presented  in  [Bergadano  et  al.  2000a].  In  fact,  CSA’s  security 
against  continuations  can  be  used  with  a  synchronization  mechanism  to  obtain  a 
very  efficient  individual  authentication  method. 

4.1  The  Chained  Stream  Authentication  Protocol  with  one  Sender  and  one  Receiver 

For now, we consider one party, named A, who will send authenticated data, and
one party, named B, who will receive the data. The protocol is defined below, where
$Sig(SK_A, x, y)$ is A’s signature under (G, S, V), and similarly $Sig(SK_B, x, y)$ for
B:
1. B → A: $h^k(\beta), SN, Sig(SK_B, h^k(\beta), SN)$

A → B: $A_1, MAC_{h^{k-1}(\alpha)}(A_1), h^k(\alpha), SN, Sig(SK_A, h^k(\alpha), SN)$

2. B → A: $h^{k-1}(\beta)$

A → B: $A_2, MAC_{h^{k-2}(\alpha)}(A_2), h^{k-1}(\alpha)$

...

i. B → A: $h^{k-i+1}(\beta)$

A → B: $A_i, MAC_{h^{k-i}(\alpha)}(A_i), h^{k-i+1}(\alpha)$
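
The exchange above can be simulated with two small classes. This is a minimal sketch under stated assumptions, not the authors' implementation: SHA-256 stands in for h and HMAC-SHA256 for MAC, the signatures and session number of step 1 are omitted (the first chain values are assumed to have been authenticated), and the class and method names are hypothetical.

```python
import hashlib
import hmac
import os

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()  # assumed stand-in for h

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()  # assumed stand-in for MAC

def chain(seed: bytes, k: int):
    c = [seed]
    for _ in range(k):
        c.append(h(c[-1]))
    return c  # c[j] = h^j(seed)

class Receiver:  # party B
    def __init__(self, k: int):
        self.k, self.step = k, 0
        self.beta = chain(os.urandom(32), k)
        self.last_rev = None   # last value of A's chain seen so far
        self.pending = None    # (block, tag) still waiting for its MAC key
        self.accepted = []

    def next_ack(self) -> bytes:
        # i-th message from B: h^{k-i+1}(beta)
        self.step += 1
        return self.beta[self.k - self.step + 1]

    def receive(self, block: bytes, tag: bytes, rev: bytes) -> None:
        if self.last_rev is not None:
            if h(rev) != self.last_rev:
                raise ValueError("broken hash chain")
            p_block, p_tag = self.pending
            if not hmac.compare_digest(mac(rev, p_block), p_tag):
                raise ValueError("bad MAC")
            self.accepted.append(p_block)
        self.last_rev, self.pending = rev, (block, tag)

class Sender:  # party A
    def __init__(self, k: int):
        self.k, self.step = k, 0
        self.alpha = chain(os.urandom(32), k)
        self.last_ack = None

    def send(self, ack: bytes, block: bytes):
        # A sends its i-th message only after a correct i-th message from B.
        if self.last_ack is not None and h(ack) != self.last_ack:
            raise ValueError("bad acknowledgement")
        self.last_ack = ack
        self.step += 1
        i = self.step
        return block, mac(self.alpha[self.k - i], block), self.alpha[self.k - i + 1]
```

A usage example: with `a, b = Sender(4), Receiver(4)`, calling `b.receive(*a.send(b.next_ack(), blk))` for each block delivers the stream; each block is accepted by B one step later, once the next message reveals its MAC key.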


Messages are sequential: A will not send message i if it has not received a correct
i-th message from B, and B will not send message i + 1 if it has not received from
A a correct i-th message. A and B initially generate individual random secrets $\alpha$
and $\beta$, and compute $h^k(\alpha)$ and $h^k(\beta)$, respectively. These values, and the session
number SN, are signed, and exchanged as part of the first messages in step 1. To
avoid a possible man-in-the-middle attack, the session number must also contain
the identifiers of the entities participating in the protocol. An explanation for
this constraint will be given in section 4.3. Then, A sends data as defined in the
CSA scheme, and B sends back authenticated acknowledgments. B’s authenticated
ack for A’s i-th message is simply $h^{k-i}(\beta)$. The receiver side is then similar to
what happens in S/Key and similar applications [Haller 1994; Lamport 1981]. The
security properties of the protocol, to be discussed next, are made relative to the
following:

Definition 8. [Active Attack Model]: the Active Attack Model with CSA
sender A, CSA receiver B, and attacker E is the following:

— E runs in polynomial time and may ask A to send B the authenticated streams
$S'^{(1)}, \ldots, S'^{(k)}$ in sessions 1, ..., k.

— A chooses stream $S^{(k+1)}$ and sends it to B in session k + 1.

— At any time, E can read messages, stop messages and insert messages.

—During session k + 1, E tries to have B receive $S' \neq S'^{(k+1)}$ and believe it authentic.

We call the above an active stream authentication attack. We shall prove in
Theorem 1 that such attacks are not feasible with the above CSA protocol, except
for the possible falsification of the last block of $S'^{(k+1)}$. We first note that session
numberings by sender and receiver are consistent:

Observation 1. Suppose A has sent its first message of session SN. Then B must
have already sent its first message of session SN.

Proof (of Observation 1). We construct an algorithm F that simulates A and
B over an insecure network, where E can perform active attacks. F controls the
simulations of A and B out of band, i.e. over a secure, separate channel. Suppose
that, with non-negligible probability, E has taken action so that B has not yet sent
the first message of session SN when A has already sent its first message of session
SN. Then we construct F so that it can forge signatures under the asymmetric
scheme (G, S, V). F simulates A normally, by first calling GA to generate a key
pair $(SK_A, PK_A)$. For simulating B, F does not generate a key pair, but relies on
the oracle for the signature required in the first message of every session. At some
point, F’s simulation of A must have computed the first message of session SN,
and it must therefore have received $Sig(SK_B, H, SN)$. However, we have supposed
F’s simulation of B has not yet computed the first message of session SN. As a
consequence, $Sig(SK_B, H, SN)$ was never queried to the oracle, and can be output
as a forged signature. □

Observation 2. Suppose B has sent its second message of session SN. Then A
must have already sent its first message of session SN.

Proof. (of Observation 2.) In the same setting as Observation 1, F simulates
B normally, by first calling G to generate a key pair $(SK_B, PK_B)$. For simulating
A, F does not generate a key pair, but relies on the oracle for the signature required
in the first message of every session. At some point F’s simulation of B must
have sent the second message of session SN, and therefore it must have received
$Sig(SK_A, H, SN)$. However, since F’s simulation of A has not yet begun running
session SN, it has not yet computed the first authenticated block. As a consequence,
$Sig(SK_A, H, SN)$ was never queried to the oracle, and can be output as a forged
signature. □

The  observations  allow  us  to  speak  of  a  “current  session”  in  the  CSA  protocol 
with  one  sender  and  one  receiver.  We  can  now  prove  that  this  protocol  represents 
a  valid  authentication  mechanism: 

Theorem 1. (Security of CSA with one sender and one receiver):
Suppose sender A and receiver B run SN sessions under the CSA protocol, where:

- the conditions of Lemma 1 hold;

- a polynomial active attacker E chooses streams $\{A^{(1)}, \ldots, A^{(SN-1)}\}$ that A sends to
B in sessions $1, \ldots, SN - 1$;

- B has received the valid authenticated stream $S' = S'_1, \ldots, S'_j$ during session SN.

Then, E can cause $S'_1, \ldots, S'_{j-1}$ to be non-authentic only with negligible probability.

We shall now prove Theorem 1 by induction on $|S'|$, based on Lemma 1. However,
we first need the following lemma, which characterizes the information available to an active
attacker at any given moment:

Lemma 2. Suppose A and B run session SN, where:

(1) (G, S, V) is a secure signature scheme, and

(2) h is a one-way hash function.

Suppose also that B has received, during session SN, the valid authenticated stream
$S' = S'_1, \ldots, S'_{j-1}$, and no more. Then, there is no active attacker E that can,
with non-negligible probability, cause A to release more than j authenticated blocks
during session SN.

Proof. (of Lemma 2.)

Let parties A and B run session SN of the CSA protocol with one sender and
one receiver. Suppose the Lemma is false. Then there is an active attacker E that
can, with non-negligible probability $\epsilon$, cause a situation where A has released the
stream $A' = A'_1, \ldots, A'_{j+1}$, while B has only received $j - 1$ blocks $S'_1, \ldots, S'_{j-1}$. Since
A has released $j + 1$ blocks, it must have received authenticated acknowledgements
$h^k(\beta), \ldots, h^{k-j}(\beta)$. There are two cases, and one must hold with probability at least
$\epsilon/2$:

Case 1: $\beta \neq \beta^{(SN)}$, the secret value chosen by B for session SN. Then we
show that we can construct an algorithm F1 that can simulate A and B, and use
a signature oracle to forge signatures under (G, S, V). F1 simulates A by running
the sender side of the CSA protocol. F1 simulates B by running the receiver side
of the CSA protocol, but uses the signature oracle to produce the signatures needed in the
first authenticated block of each session. When E has caused the situation covered
by this case, F1’s simulation of A must have received $Sig(SK, \beta, SN)$, and since
$\beta \neq \beta^{(SN)}$ this can be output as a new forged signature.

Case 2: $\beta = \beta^{(SN)}$. Then we can construct F2 that can compute $h^{-1}(x)$, for
any x, with non-negligible probability. F2 will simulate A and B, running the CSA
protocol over a network where E can perform active attacks. F2 simulates A by
running the sender side of the CSA protocol. F2 simulates B by running the receiver
side of the CSA protocol, but first sets the following values:

- pick SN at random between 1 and $SN_{max}$, where $SN_{max}$ is the maximum number
of sessions that the attacker is able to cover;

- pick j at random between 1 and $j_{max}$, where $j_{max}$ is the maximum number of
blocks per session in E’s active attacks;

- let $k = j - 1$ and $h^k(\beta^{(SN)}) = h^{j-1}(x)$;

B is then able to send to A all acknowledgments required for receiving the first
$j - 1$ blocks, i.e., $h^k(\beta^{(SN)}) = h^{j-1}(x), \ldots, h^{k-(j-1)}(\beta^{(SN)}) = x$. When E has
caused the situation covered by this case, F2’s simulation of A must have received
$h^{k-j}(\beta^{(SN)}) = h^{-1}(x)$. This happens with non-negligible probability $\epsilon / (2 \, j_{max} \, SN_{max})$. □

Proof. (of Theorem 1.)

We prove a stronger claim by induction on $|S'| = j$, namely: under the conditions
of the Theorem, the thesis holds and, if $S'$ is non-authentic, then E can cause B to
receive a valid continuation of $S'$ only with negligible probability.

Base: We show that the claim is true for $j = 1$. Before B has received from A
the first message of session SN, by Lemma 2, A has released no more than the first
block of session SN. Therefore, the information available to E consists of:

- the previous authenticated streams $A'^{(1)}, \ldots, A'^{(SN-1)}$ sent by A, and

- the first block of session SN.

Let $S'_1 = S_1, MAC_1, H, SN, Sig(SK, H, SN)$. Since $S'$ is valid, by Lemma 1, there
is $q \in [1, SN]$ such that $S'^{(q)}_1 = S^{(q)}_1, MAC^{(q)}_1, H, SN, Sig(SK, H, SN)$. Since
sequence number SN is only used in $A'^{(SN)}$, clearly the only value for q is SN. Also
by Lemma 1, if $S'_1 \neq A'^{(SN)}_1$, then there is no polynomial algorithm that can generate
a valid authenticated continuation of $S'$, given adaptively chosen streams, and
any valid continuation of $A'^{(SN)}$. So neither can E, even after obtaining $A'^{(SN)}_1$.

Inductive step: Suppose that the inductive claim is true for $j - 1$. We prove that
it is also true for j. In order to do so, suppose B has received the valid stream
$S' = (S'_1, \ldots, S'_j)$. This means that it was possible to produce a valid continuation
of $(S'_1, \ldots, S'_{j-1})$, and, by the inductive hypothesis, $(S'_1, \ldots, S'_{j-1})$ must be authentic
with probability $1 - \eta$, where $\eta$ is negligible. We now have to prove that, if $S'_j$ is
non-authentic, then E can cause B to receive a valid continuation of $S'$ only with
negligible probability. By Lemma 2, before B’s receipt of $S'_j$, A has released at
most j blocks during session SN. Therefore, the information available to E before
B’s receipt of $S'_j$ consists of:

- the previous authenticated streams $A'^{(1)}, \ldots, A'^{(SN-1)}$ sent by A, and

- the stream $A'^{(SN)} = A'^{(SN)}_1, \ldots, A'^{(SN)}_j$ sent by A during session SN.

Let $S'_1 = S_1, MAC_1, H, SN, Sig(SK, H, SN)$. Since $S'$ is valid, by Lemma 1,
there is $q \in [1, SN]$ such that $S'^{(q)}_1 = S^{(q)}_1, MAC^{(q)}_1, H, SN, Sig(SK, H, SN)$. Since
sequence number SN is only used in $A'^{(SN)}$, clearly the only value for q is SN. Also
by Lemma 1, if $S'_j \neq A'^{(SN)}_j$, then there is no polynomial algorithm that can generate
a valid authenticated continuation of $S'$, given adaptively chosen streams, and
any valid continuation of $A'^{(SN)}$. So neither can E, even after obtaining $A'^{(SN)}_j$. □

The last block of $S'$ may be modified by E. Thus, authentication is obtained
with a one-block delay: the receiver can ascertain the origin and the integrity of
a block in the stream only after receiving the next block. This is acceptable in
most multicast applications, and it is achieved without shared secrets and without
signatures after the first message. The important consequences of this fact are
discussed in the next section, where the protocol is used with more than just two
parties.
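
The one-block delay can be illustrated with a minimal Python sketch. It is not the implementation described in this paper: the names `hash_chain`, `mac` and `authenticate_block` are hypothetical, and SHA-256 with HMAC is an assumption standing in for the generic one-way function h and the MAC. A block is accepted only once the key used for its MAC has been disclosed and shown to hash to the last trusted chain value.

```python
import hashlib
import hmac
from typing import List


def h(x: bytes) -> bytes:
    """One-way function h (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(x).digest()


def hash_chain(secret: bytes, k: int) -> List[bytes]:
    """Return [h^0(secret), h^1(secret), ..., h^k(secret)]."""
    chain = [secret]
    for _ in range(k):
        chain.append(h(chain[-1]))
    return chain


def mac(key: bytes, block: bytes) -> bytes:
    """MAC of a data block under a chain value used as key."""
    return hmac.new(key, block, hashlib.sha256).digest()


def authenticate_block(block: bytes, tag: bytes,
                       disclosed_key: bytes, last_trusted: bytes) -> bool:
    """Accept a block only if the disclosed key extends the trusted chain
    and the MAC received earlier verifies under that key."""
    if h(disclosed_key) != last_trusted:
        return False
    return hmac.compare_digest(mac(disclosed_key, block), tag)
```

In this sketch the signed value $h^k(\alpha)$ plays the role of `last_trusted` for the first block; after each successful check the disclosed key becomes the new trusted value.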

4.2 The N-party I-CSA Protocol

We  will  now  extend  the  protocol  so  that  it  can  be  used  effectively  in  a  multicast 
conferencing  scenario.  As  a  first  step,  we  will  consider  two  parties,  A  and  B, 
that  must  exchange  authenticated  data  in  both  directions.  Obviously,  this  can 
be  done  by  simply  applying  the  CSA  protocol  with  sender  A  and  receiver  B,  and 
simultaneously  with  sender  B  and  receiver  A.  However,  we  can  make  this  more 
efficient  by  merging  the  hash  values  sent  as  acknowledgments  and  the  ones  sent  as 
hashes  of  secrets.  This  results  in  the  following  two-party  protocol: 

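1. B -> A: $B_1, MAC_{h^{k-1}(\beta)}(B_1), h^k(\beta), SN, Sig(SK_B, h^k(\beta), SN)$

   A -> B: $A_1, MAC_{h^{k-1}(\alpha)}(A_1), h^k(\alpha), SN, Sig(SK_A, h^k(\alpha), SN)$

2. B -> A: $B_2, MAC_{h^{k-2}(\beta)}(B_2), h^{k-1}(\beta)$

   A -> B: $A_2, MAC_{h^{k-2}(\alpha)}(A_2), h^{k-1}(\alpha)$

...

i. B -> A: $B_i, MAC_{h^{k-i}(\beta)}(B_i), h^{k-i+1}(\beta)$

   A -> B: $A_i, MAC_{h^{k-i}(\alpha)}(A_i), h^{k-i+1}(\alpha)$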

The  above  protocol  may  cause  practical  transmission  difficulties.  In  particular, 
each  party  may  send  a  block  of  data  only  after  receiving  a  corresponding  block 
from  the  other  party.  This  causes  a  kind  of  stop-and-wait  behaviour  that  implies 
poor  network  utilization  and  may  result  in  unacceptable  delays  for  real-time  traffic. 
Fortunately,  such  strict  sequentialization  of  messages  is  not  necessary.  In  particular, 
data  blocks  and  MACs  can  be  sent  at  any  time  -  only  the  delivery  of  keys  need 
be  delayed  until  acknowledgements  are  obtained  and  verified.  We  may  therefore 
rewrite  the  above  two  party  protocol  by  splitting  the  behaviour  of  each  party  into 
a data sender process and a key sender process, that run independently. For party
A, the processes are defined as follows (party B is defined symmetrically):

A’s data sender process:

1. send to B: $MAC_{h^{k-1}(\alpha)}(A_1), A_1$
2. send to B: $MAC_{h^{k-2}(\alpha)}(A_2), A_2$
...

A’s key sender process:

1. wait for $MAC_{h^{k-1}(\beta)}(B_1)$
   send to B: $h^k(\alpha), SN, Sig(SK_A, h^k(\alpha), SN)$

2. wait for $MAC_{h^{k-2}(\beta)}(B_2)$
   wait for $h^k(\beta), SN, Sig(SK_B, h^k(\beta), SN)$
   send to B: $h^{k-1}(\alpha)$

3. wait for $MAC_{h^{k-3}(\beta)}(B_3)$
   wait for $h^{k-1}(\beta)$
   send to B: $h^{k-2}(\alpha)$
...
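
The release rule of the key sender process can be summarized in a small Python sketch. The function names and the counting convention are hypothetical and are introduced here only for illustration: step n releases $h^{k-n+1}(\alpha)$, and it may do so only after n MACs from the peer have arrived and n - 1 of the peer's chain values (the signed $h^k(\beta)$ counting as the first) have been received and verified.

```python
def release_exponent(step: int, k: int) -> int:
    """Exponent e such that key-sender step `step` sends h^e(alpha).

    Step 1 sends the signed h^k(alpha), step 2 sends h^(k-1)(alpha), etc.
    """
    return k - step + 1


def may_release(step: int, peer_macs_received: int,
                peer_values_verified: int) -> bool:
    """Check the waiting conditions of key-sender step `step`.

    peer_macs_received:   number of MACs received from the peer so far.
    peer_values_verified: number of peer chain values received and verified
                          so far; the signed h^k(beta) counts as the first.
    """
    return peer_macs_received >= step and peer_values_verified >= step - 1
```

Data blocks and MACs are never gated by this check; only the release of chain values is delayed, which is what removes the stop-and-wait behaviour.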

Delaying  just  secrets,  not  information,  is  essential  in  multicast  conferencing:  the 
receiver  will  continue  viewing  the  stream  of  data  even  though  the  keys  necessary 
to  authenticate  it  are  not  yet  available.  When  secrets  are  late,  viewing  is  ahead  of 
authentication,  and  we  call  this  an  authentication  delay.  The  delay  would  be  small 
and  roughly  equivalent  to  three  times  the  network  latency.  The  reason  is  that  a 
MAC  must  be  sent,  then  the  authenticated  acknowledgement  is  returned,  and  finally 
the  MAC  key  is  sent.  Only  then  can  the  corresponding  block  be  authenticated  by 
the  receiver. 

We may now generalize the above construction and obtain the N-party protocol.
During session SN, party i first generates a random secret $\alpha_i$ and computes
$h^k(\alpha_i)$. Party i consists of two processes, a data sender and a key sender, that run
concurrently as described below, where $A_{i,j}$ is the j-th block sent by party i:

Data sender i:

1. multicast $MAC_{h^{k-1}(\alpha_i)}(A_{i,1})$ and $A_{i,1}$;

2. multicast $MAC_{h^{k-2}(\alpha_i)}(A_{i,2})$ and $A_{i,2}$;
...

Key sender i:

1. wait for $MAC_{h^{k-1}(\alpha_j)}(A_{j,1})$, for all $j \in [1, N]$;
   multicast $h^k(\alpha_i), SN, i, Sig(SK_i, h^k(\alpha_i), SN, i)$;

2. wait for $MAC_{h^{k-2}(\alpha_j)}(A_{j,2})$, for all $j \in [1, N]$;
   wait for $h^k(\alpha_j), SN, j, Sig(SK_j, h^k(\alpha_j), SN, j)$, for all $j \in [1, N]$;
   multicast $h^{k-1}(\alpha_i)$;

3. wait for $MAC_{h^{k-3}(\alpha_j)}(A_{j,3})$, for all $j \in [1, N]$;
   wait for $h^{k-1}(\alpha_j)$, for all $j \in [1, N]$;
   multicast $h^{k-2}(\alpha_i)$;
...

It is important to note that, for every block, the only authentication information
that is multicast is one MAC (sent by the data sender) and one hash value (sent
by the key sender). This does not grow with N.

4.3  Considerations  on  a  possible  man-in-the-middle  attack 

It is worth noting that, because the names of the sender(s) and receiver(s) are indicated
explicitly in SN, our protocol does not suffer from man-in-the-middle attacks, whether
mounted by outsiders or by members of the group. For simplicity we show the attack
on a communication between two parties, A and B; the countermeasure can be
generalized to the N-party protocol, even if the attacker I, who impersonates A
(written I(A)), is a member of the group.

Let  us  see  the  attack: 

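1. B -> I(A): $B_1, MAC_{h^{k-1}(\beta)}(B_1), h^k(\beta), SN, Sig(SK_B, h^k(\beta), SN)$

1’. I -> A: $I_1, MAC_{h^{k-1}(\gamma)}(I_1), h^k(\gamma), SN, Sig(SK_I, h^k(\gamma), SN)$

    A -> I: $A_1, MAC_{h^{k-1}(\alpha)}(A_1), h^k(\alpha), SN, Sig(SK_A, h^k(\alpha), SN)$

2’. I -> A: $I_2, MAC_{h^{k-2}(\gamma)}(I_2), h^{k-1}(\gamma)$

    A -> I: $A_2, MAC_{h^{k-2}(\alpha)}(A_2), h^{k-1}(\alpha)$

2. I(A) -> B: $A'_1, MAC_{h^{k-1}(\alpha)}(A'_1), h^k(\alpha), SN, Sig(SK_A, h^k(\alpha), SN)$
and from now on, I can forge all of A’s messages.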

This attack is avoided if the session number SN also contains the identifiers of the
participants in the communication; these identifiers are also needed by the participants
because the protocol requires waiting for the signatures, MACs and keys from all
the participants. In this case, A would not accept the message from I because
it is signed by I but the session is for A and B. In the multiparty case, an intruder
external to the group cannot forge any message for the reason above. A participant
of the group, in turn, cannot masquerade as another participant because the
attack shown requires that A release two keys at the beginning, but A will not
release the second key if she has not received the signatures and all the previous
MACs and keys from the other participants. In this situation, forging cannot be
performed by anyone.
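
The countermeasure can be sketched in Python as a check performed on the first, signed message of a session. The `SessionId` type and the function `accept_signed_opening` are hypothetical names used only for illustration, and signature verification itself is abstracted into a boolean supplied by the caller.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SessionId:
    """Session number extended with the identifiers of all participants."""
    number: int
    participants: FrozenSet[str]


def accept_signed_opening(session: SessionId, signer: str,
                          signature_valid: bool) -> bool:
    """Accept a signed opening message only if the signature verifies and
    the signer is one of the participants named in the session identifier."""
    return signature_valid and signer in session.participants
```

With a session identifier naming only A and B, a correctly signed opening message from I is rejected by A, which is the behaviour described above.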

5.  TIMED  CSA  (T-CSA) 

In  this  section  we  apply  the  CSA  scheme  to  network  applications  where  receivers 
do  not  transmit  data  and  are  unable  to  send  acknowledgements  in  due  time.  This 
section  is  based  on  our  initial  presentation  in  [Bergadano  et  al.  2000b].  This  protocol 
works  with  one-directional  transmission  from  one  sender  to  any  number  of  receivers 
(broadcast  transmission).  Given  that  there  is  no  feedback  from  the  receivers,  the 
security  of  this  protocol  is  based  on  the  time  of  transmission  of  the  authentication 
information. 

The technique we propose here is especially suited for efficient data authentication
in broadcast, or multicast with large groups. It is based on hash chains,
but does not suffer from the delay-related problems of the Guy Fawkes protocol.
The authentication scheme is detailed below, separately for the broadcast sender
and for the receivers. The sender sends data $A_i$ and authentication information in
separate streams (corresponding to the Data Sender and Key Sender procedures
defined below). The sender also announces the session, including some essential
security information. Receivers consume data and verify authenticity in two separate
processes (Receiver and Authenticator).

5.1  Sender 

Announcement:
broadcast the following:

    session number SN,
    starting time ST,
    $\alpha_k = h^k(\alpha)$ (where $\alpha$ is a random secret,
        h is a collision-resistant hash function),
    maximum time imprecision $\epsilon_s > 0$ at sender;
    signature $Sig(SK_A, SN, ST, \alpha_k, \epsilon_s)$

Data sender:

    broadcast $MAC_{h^{k-1}(\alpha)}(A_1), A_1$ (at time ST)
    broadcast $MAC_{h^{k-2}(\alpha)}(A_2), A_2$ (at time ST+T)
    ...

Key sender:

    wait until time ST+delay, broadcast $\alpha_{k-1} = h^{k-1}(\alpha)$
    wait until time ST+delay+T, broadcast $\alpha_{k-2} = h^{k-2}(\alpha)$
    wait until time ST+delay+2T, broadcast $\alpha_{k-3} = h^{k-3}(\alpha)$
    ...


The delay is needed so that keys are not made public too early, i.e., before the
corresponding data and MACs are received. A good choice could be
$delay = latency + \epsilon_s + \max_{r \in Receivers}(\epsilon_r) + R$,
where $\epsilon_r > 0$ is the receiver clock precision, i.e. the maximum time-advance
imprecision of the receivers’ clocks, R is a reliability parameter, and latency is the
estimated maximum propagation time. T is the time length of each data
packet; in the case of audio transmission, T is the time length of the sampled audio that is
contained in one packet.
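
As a worked illustration of the formula above, the following Python sketch computes the key release delay and the release time of each key. The function names are hypothetical, and the choice of integer milliseconds as the time unit is an assumption of the sketch, made only to keep the arithmetic exact.

```python
from typing import Iterable


def key_release_delay(latency: int, eps_sender: int,
                      eps_receivers: Iterable[int], reliability: int) -> int:
    """delay = latency + eps_s + max over receivers of eps_r + R (all in ms)."""
    return latency + eps_sender + max(eps_receivers) + reliability


def key_release_time(start_time: int, delay: int,
                     packet_interval: int, i: int) -> int:
    """Time at which the key for block i (alpha_{k-i}) is broadcast:
    ST + delay + (i - 1) * T, following the key sender procedure."""
    return start_time + delay + (i - 1) * packet_interval
```

For example, with a latency of 200 ms, a sender imprecision of 10 ms, receiver imprecisions of 50 ms and 20 ms, and R = 100 ms, the delay is 360 ms; with ST = 1000 ms and T = 20 ms, the key for the third block is released at time 1400 ms.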

5.2  Receiver 

Receiver: 
repeat  forever: 

receive  X  (where  X  is  a  MAC,  a  data  block, 
or  a  key); 
mark  receipt  time; 
store  X; 

if  X  is  a  data  block,  consume  X; 

Authenticator: 

receive  session  announcement 




    $SN, ST, \alpha_k, \epsilon_s, Sig(SK_A, SN, ST, \alpha_k, \epsilon_s)$
if $Sig(SK_A, SN, ST, \alpha_k, \epsilon_s)$ is wrong, exit;
for i = 1 to k do:

    wait until all of the following is received:

        $MAC_{\alpha_{k-i}}(A_i)$,

        $A_i$,

        $\alpha_{k-i}$;

    if $h(\alpha_{k-i}) \neq \alpha_{k-i+1}$ then exit,
    if $MAC_{\alpha_{k-i}}(A_i)$ is late,
        mark authentication of $A_i$ as unknown;
    if $MAC_{\alpha_{k-i}}(A_i)$ is wrong,
        mark $A_i$ as forged;
    else mark $A_i$ as authentic;


We must define the meaning of “late” in the authenticator procedure. Informally,
the MAC is late when received after the corresponding key was released. However,
we must consider that sender time and receiver time differ, hence:

$MAC_{\alpha_{k-i}}(A_i)$ is late when received at time $T_M > ST + (i - 1) \cdot T - \epsilon_s - \epsilon_r + delay$.

In fact, if the MAC is received after this deadline, then the key could already have been
released, and with that key anybody could have generated the MAC. In this
case, the block is received and consumed, but is marked as possibly non-authentic.
Further blocks may again be authenticated normally, hence abnormal network delays
cause authentication “holes”, but no other security problem. The choice of
the key sending delay time is important: large values of the delay will reduce the
number of authentication holes, but will cause authentication decisions to be delayed
with respect to data consumption. Small values of delay allow the receivers
to authenticate data shortly after it has been played, but lead to a risk of blocks
with “unknown” authenticity.
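
One iteration of the authenticator, including the lateness test, can be sketched in Python as follows. The function names are hypothetical, times are assumed to be integers in milliseconds, and HMAC with SHA-1 is used because Section 6 reports that choice for the implementation; raising an exception stands in for the "exit" of the pseudocode.

```python
import hashlib
import hmac


def h(x: bytes) -> bytes:
    """Hash function of the chain (SHA-1, as reported for the implementation)."""
    return hashlib.sha1(x).digest()


def mac_deadline(st: int, i: int, T: int,
                 eps_s: int, eps_r: int, delay: int) -> int:
    """Latest receipt time for the MAC of block i:
    ST + (i - 1) * T - eps_s - eps_r + delay."""
    return st + (i - 1) * T - eps_s - eps_r + delay


def classify_block(block: bytes, tag: bytes, tag_time: int,
                   key: bytes, prev_key: bytes, deadline: int) -> str:
    """Return 'authentic', 'forged' or 'unknown' for one block."""
    if h(key) != prev_key:
        # The disclosed key does not extend the signed hash chain.
        raise ValueError("key does not match the hash chain")
    if tag_time > deadline:
        # The key may already have been public when the MAC arrived.
        return "unknown"
    expected = hmac.new(key, block, hashlib.sha1).digest()
    if not hmac.compare_digest(expected, tag):
        return "forged"
    return "authentic"
```

A block classified as "unknown" is still consumed; only its authentication status is left open, which corresponds to the authentication holes discussed above.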

This protocol is robust to network losses: if a key is lost, only the packet
authenticated with that key will have unknown authentication, while the other
packets will be verified normally, because the hash function can be applied as many
times as required to produce an already authenticated key (in the extreme case, to
arrive at the signed key). Moreover, if the key is authentic, all the generated (but
previously lost) keys may be used as if they were correctly received to authenticate
the corresponding data packets (see Figure 5.2). In fact, suppose that key $\alpha_{i+1}$ is
authentic and that the key $\alpha_i$ is lost. If key $\alpha_{i-1}$ is correctly received, it can be
verified with two applications of h, i.e. $\alpha_{i-1}$ is authentic if and only if $h(h(\alpha_{i-1})) = \alpha_{i+1}$,
and if this test is true, from $h(\alpha_{i-1})$ we also generate $\alpha_i$, which is authentic by
definition and may be used to verify the corresponding packet. If a MAC is lost,
only the corresponding packet will be affected with an unknown authentication.
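
The recovery of lost keys can be illustrated with a short Python sketch. The function `recover_keys` and its `max_gap` bound are hypothetical, and SHA-1 is assumed for h as in the implementation described in Section 6. Starting from a newly received key, the hash function is applied repeatedly until the last authenticated key is reached; every intermediate value is then a key that was lost in transit.

```python
import hashlib
from typing import List, Optional


def h(x: bytes) -> bytes:
    """Hash function of the chain (SHA-1 is an assumption of this sketch)."""
    return hashlib.sha1(x).digest()


def recover_keys(received: bytes, last_trusted: bytes,
                 max_gap: int) -> Optional[List[bytes]]:
    """Try to link `received` to `last_trusted` with at most `max_gap` hashes.

    On success, return [received, h(received), ...], i.e. the received key
    followed by every lost key between it and `last_trusted` (exclusive).
    Return None if the chain does not connect within `max_gap` steps.
    """
    path = [received]
    for _ in range(max_gap):
        nxt = h(path[-1])
        if nxt == last_trusted:
            return path
        path.append(nxt)
    return None
```

In the example of the text, calling the function with the received $\alpha_{i-1}$ and the trusted $\alpha_{i+1}$ returns $\alpha_{i-1}$ and the recovered $\alpha_i$.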

6. THE MULTICAST AUTHENTICATION TOOL

We have developed an authentication tool called Multicast Authentication Tool
(MAT) that implements the CSA with time synchronization (T-CSA) presented
in the previous section. MAT has been designed as a separate module that we
integrated into the code of the Robust Audio Tool [Robust Audio Tool]. The module
can however be easily integrated into any audio/video application that requires
individual authentication. RAT is a well known audio tool developed by the University
College London Networked Multimedia Research Group with the goal of providing
an application that allows users to participate in multicast conferences over the
MBone. MAT uses parts of the data structures of RAT and adds its own. The
MAT functions are added to the functions of RAT and are called in various parts
of the RAT code. Conceptually, we have an engine that integrates into the engine of RAT
and a new user interface that communicates with the engine. The user interface is
shown in Figure 6. In our implementation we have a more complex user interface,
which allows us to control the reception of the various packets of the protocol, so that
we can experiment with packet losses, packet delays and packet modifications.

Fig. 2. The user interface of the MAT authentication application.

The Robust Audio Tool sends the audio and control packets according to the
RTP/RTCP [Schulzrinne et al. 1996] protocol for real time data transmission. The
RTP/RTCP traffic is sent on two consecutive transport ports. We use a new transport
port, the security port, placed immediately after the RTCP port, for traffic related to the
authentication protocol: in this way we do not interfere with the RTP/RTCP traffic
of RAT. The MACs could be inserted in the extension part of the RTP packet,
but given that not all audio applications handle extensions (some simply
discard the packet if there are extensions), we send the MAC as separate data
on the security port.

MAT  allows  two  types  of  authentication:  RSA,  where  each  audio  packet  has  an 
associated  signature  packet  on  the  security  port  (with  no  time  constraints)  and 
T-CSA. 

All the packets of our authentication protocol flow on the channel defined by the
security port. The possible packets related to the T-CSA protocol are MAC packets,
where the MACs are obtained from the chained keys of the CSA; signed-key packets, which
sign the announcement and define the session number and the last key of the chain;
and key packets, each one containing a key of the chain. When the application runs
the RSA based protocol, and signs each audio packet, the signature packets
are also multicast on the security port. Moreover, when the application is started, the
certificate of the participant is sent on the security port.

The T-CSA implementation works in phases: each phase consumes k keys. In
each phase a signature of $[SN, ST, h^k(\alpha), \epsilon_s]$ is released, along with the chain of
keys. Each key is released at the correct instant of time according to the protocol
in Section 5.1. Meanwhile, the audio packets and the corresponding MACs are
sent as soon as possible. In this way we have one asymmetric signature and k MAC
calculations per phase: our implementation uses RSA for the signature and HMAC [Krawczyk
et al. 1997], with SHA-1, for the MAC.

Given  that  the  keys  are  consumed  from  the  beginning  of  the  transmission  of  the 
session,  a  key  generation  process  builds  a  new  chain  in  such  a  way  that  when  the 
last  key  of  the  old  chain  is  consumed,  the  first  one  of  the  new  chain  is  available. 

6.1  Performance 

The advantage of the CSA protocol is that it uses very little authentication
information (a MAC and a key of the same size as a MAC, per packet) and that this
information is very fast to compute compared with an RSA signature: after the first
message, CSA requires computing just two hash values to authenticate
each message. RSA could be made more efficient by using a single RSA
signature to authenticate many packets. The drawback of this solution
is that, in case of packet loss, all the packets under the same signature
will result in an “unknown” authentication. Moreover, extending the authentication
over many packets delays knowledge of the authentication status
of the received packets: for example, if five minutes of audio data are authenticated with
just one signature, receivers learn that what they heard is authentic
only at the end of the five minutes, and cannot trust it until then.

The T-CSA protocol may also be implemented in multicast video conferencing
tools.

7.  CONCLUSIONS  AND  THE  NON-REPUDIATION  ISSUE 

The problem of individual authentication in multicast applications poses novel
security problems. Proposed solutions have to deal with efficiency and scalability
issues, and should adapt to the evolution of protocols at the network and transport
levels.

In  this  paper  we  have  shown  two  possible  solutions  to  the  problem  of  individual 
authentication  in  the  group  communication  context.  Both  are  based  on  the  concept 
of  hash  chain,  linking  the  authentication  information  of  each  data  packet.  These 
solutions  are  very  efficient  because  they  are  based  on  MACs,  and  use  an  asymmetric 
digital  signature  only  once,  to  bootstrap  and  to  authenticate  the  hash  chain.  More¬ 
over,  the  amount  of  authentication  information  does  not  depend  on  the  number  of 
receivers  that  have  to  authenticate  the  data. 

The first proposed solution, I-CSA, is an interactive stream authentication
protocol that is proved to be secure against non-authentic stream continuations.

The  second  protocol,  T-CSA,  is  based  on  the  same  concept  of  hash  chain,  but 
does  not  require  a  continuous  exchange  of  data  among  the  parties;  instead,  it  is 
tailored  for  multicast/broadcast  applications  in  which  the  data  flows  in  one  direc¬ 
tion  only  (from  sender  to  receivers).  Given  that  there  is  no  feedback  from  the 
receivers,  the  security  of  this  protocol  is  based  on  time  synchronization  among 
the  participants,  and  on  the  time  when  the  authentication  information  is  released. 
Nonetheless,  T-CSA  may  be  used  when  there  is  continuous  exchange  of  data  among 
the  participants. 

Due to the use of MACs, the proposed schemes have been shown to be very efficient,
and the implementation of T-CSA in a widely used audio application has demonstrated
that the overhead of this protocol is compatible with the load of a machine
performing multimedia processing in the context of a teleconference.

A last consideration is in order. In a recent paper [Boneh et al. 2000], Boneh
et al. prove that any secure multicast MAC whose length is slightly less than
the number of receivers can be converted into a signature scheme, hence providing
non-repudiation. On the other hand, CSA is secure, its length is independent of
the number of receivers, and it is not a signature. We believe that the reason for
this apparent contradiction is that the formalization of Boneh et al. concerns the
authentication of a message M, not a stream S. CSA does not generate a single
authentication code for the whole stream; it creates separate codes for every block,
and such codes are released either after receipt acknowledgements (I-CSA) or based
on time constraints (T-CSA). Hence one cannot just concatenate the separate
codes to form one MAC for the stream.

It is true, though, that we do obtain non-repudiation with respect to subjects that
participate in the protocol. For T-CSA this means just “listening” to the messages.
This feature also belongs to the Guy Fawkes protocol [Anderson et al. 1998], which was
in fact presented initially as a form of “symmetric signature”.

Abstract

We  present  a  new  efficient  paradigm  for  signing  digital  streams.  The  problem  of  signing 
digital  streams  to  prove  their  authenticity  is  substantially  different  from  the  problem 
of  signing  regular  messages.  Traditional  signature  schemes  are  message  oriented  and 
require  the  receiver  to  process  the  entire  message  before  being  able  to  authenticate  its 
signature. However, a stream is a potentially very long (or infinite) sequence of bits
that  the  sender  sends  to  the  receiver  and  the  receiver  is  required  to  consume  the  received 
bits  at  more  or  less  the  input  rate  and  without  excessive  delay.  Therefore  it  is  infeasible 
for  the  receiver  to  obtain  the  entire  stream  before  authenticating  and  consuming  it. 
Examples  of  streams  include  digitized  video  and  audio  files,  data  feeds  and  applets. 

We  present  two  solutions  to  the  problem  of  authenticating  digital  streams.  The  first 
one  is  for  the  case  of  a  finite  stream  which  is  entirely  known  to  the  sender  (say  a  movie). 
We  use  this  constraint  to  devise  an  extremely  efficient  solution.  The  second  case  is  for  a 
(potentially  infinite)  stream  which  is  not  known  in  advance  to  the  sender  (for  example 
a  live  broadcast).  We  present  proofs  of  security  of  our  constructions.  Our  techniques 
also  have  applications  in  other  areas,  for  example,  efficient  authentication  of  long  files 
when  communication  is  at  a  cost  and  signature-based  filtering  at  a  proxy  server. 


1  Introduction 

Digital  Signatures  (see  [10,  22])  are  the  cryptographic  answer  to  the  problem  of  information 
authenticity.  When  a  recipient  receives  digitally  signed  information  and  she  is  able  to  verify 
the  digital  signature  then  she  can  be  certain  that  the  information  she  received  is  exactly  the 
same  as  what  the  sender  (identified  by  his  public  key)  has  signed.  Moreover,  this  guarantee 
is  non-repudiable,  i.e.,  the  entity  identified  by  the  public  key  cannot  later  deny  having 
signed  the  information.  Thus,  the  recipient  can  hold  the  signer  responsible  for  the  content 
she receives. This distinguishes digital signatures from message authentication codes (MACs),
which allow the receiver to have confidence in the identity of the sender, but not to prove
this fact to someone else, i.e., MACs are repudiable.

However, current digital signature technology was designed to ensure message authentication,
and its straightforward application does not yield a satisfactory solution when applied
to information resources which are not message-like. In this paper we discuss one such type
of resource: streams. We point out shortcomings in several approaches (some of them used
in practice) to tackle the problem of signing streams and then present our solution, which
does not have such shortcomings.

*I.B.M. T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, U.S.A. Email:
{rosario,rohatgi}@watson.ibm.com. A preliminary version of this paper appeared in the proceedings of
CRYPTO '97.

1.1  Streams  Defined 

A stream is a potentially very long (possibly infinite) sequence of bits that a sender sends to a
receiver. The stream is usually sent at a rate which is negotiated between the sender
and receiver, or there may be a demand-response protocol in which the receiver repeatedly
sends requests for an additional (finite) amount of data. The main feature of streams which
distinguishes them from messages is that the receiver must consume the data it receives at
more or less the input rate, i.e., it cannot buffer large amounts of unconsumed data. In fact,
in many applications the receiver stores only relatively small amounts of the stream. In
some cases the sender itself may not store the entire sequence, i.e., it may not store the
information it has already sent out and it may not know anything about the stream much
beyond what it has sent out.

There are many examples of digital streams. Common examples include digitized video
and audio, which is now routinely transported over the Internet and also to television viewers
via various means, e.g., via direct broadcast satellites and very shortly via cable, wireless
cable, telephone lines, etc. This includes both pre-recorded and stored audio/video programming
as well as live feeds. Apart from audio/video, there are also data feeds (e.g.,
news feeds, stock market quotes, etc.) which are best modeled as a stream. The Internet and
the emerging interactive TV industry also provide another example of an information resource
which is best modeled as a stream, i.e., applets. Most non-trivial applets are actually
very large programs which are organized into several modules. The consumer's machine
first downloads and executes the startup module and, as the program proceeds, additional
modules are downloaded and executed. Also, modules that are no longer in use may be
discarded by the consumer machine. This structure of applets is forced by two factors.
Firstly, the amount of storage available on the consumer machine may be limited (e.g., in
the emerging interactive TV industry set-top boxes have to be cheap and therefore resource
limited). Secondly (in the case of the Internet), the bandwidth available to download code
may be limited and applets must be designed to start executing as soon as possible. Also,
it is quite likely that some of the more sophisticated applets may have data-rich components
generated on the fly by the applet server. Therefore applets fit very nicely into the
demand/response streams paradigm.

Given the above description, it is clear that message-oriented signature schemes cannot
be directly used to sign streams, since the receiver cannot be expected to receive the entire
stream before verifying the signature. If a stream is infinitely long (e.g., a 24-hour news
channel), then it is impossible for the receiver to receive the entire stream, and even if a
stream is finite but long the receiver would have to violate the constraint that the stream
needs to be consumed at roughly the input rate and without delay.

1.2  Previous  Solutions  and  their  Shortcomings 

To the authors' knowledge, no specific solution to the problem of signing digital streams
has been proposed in the crypto literature. One can envision several possible solutions,
some of which are actually proposed for use in practice.

One type of solution splits the stream into blocks. The sender signs each individual block
and the receiver loads an entire block and verifies its signature before consuming it. This
solution also works if the stream is infinite. However, this solution forces the sender to
generate a signature for each block of the stream and the receiver to verify a signature for
each block. With today's signature schemes either one or both of these operations can be
very expensive computationally. This in turn means that the operations of signing and
verifying can create a bottleneck to the transmission rate of the stream.

Another  type  of  solution  works  only  for  finite  streams  which  are  known  in  advance.  In 
this  case,  once  again  the  stream  is  split  into  blocks.  Instead  of  signing  each  block,  the 
sender  creates  a  table  listing  cryptographic  hashes  of  each  of  the  blocks  and  signs  this 
table.  When  the  receiver  asks  for  the  authenticated  stream,  the  sender  first  sends  the 
signed  table  followed  by  the  stream.  The  receiver  first  receives  and  stores  this  table  and 
verifies the signature on it. If the signature matches then the receiver has the authenticated
cryptographic hash of each of the blocks in the stream and thus each block can be verified when
it arrives. The problem with this solution is that it requires the storage and maintenance of
a potentially very large table on the receiver's end. In many realistic scenarios the receiver
buffer is very limited compared to the size of the stream (e.g., in MPEG a typical movie
may be 20 GBytes whereas the receiver buffer is only required to be around 250 KBytes).
Therefore the hash table can itself become fairly large (e.g., 50000 entries in this case, or
800 KBytes for the MD5 hash function) and it may not be possible to store the hash table
itself. Also, the hash table itself needs to be transmitted first and if it is too large then there
will be a significant delay before the first piece of the stream is received and consumed. To
address the problem of large tables one can also come up with a hybrid scheme in which
the stream is split into consecutive pieces and each piece is preceded by a small signed table
of contents.¹
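The table-size figures quoted above can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the helper name `hash_table_bytes` is ours, and reading "250 KBytes" as 250,000 bytes is an assumption.

```python
MD5_DIGEST_LEN = 16    # bytes per MD5 digest
SHA1_DIGEST_LEN = 20   # bytes per SHA-1 digest


def hash_table_bytes(num_blocks: int, digest_len: int) -> int:
    """Size of a table holding one digest per stream block."""
    return num_blocks * digest_len


entries = 50_000               # number of table entries in the paper's MPEG example
receiver_buffer = 250 * 1000   # assumed reading of "250 KBytes"

md5_table = hash_table_bytes(entries, MD5_DIGEST_LEN)
sha1_table = hash_table_bytes(entries, SHA1_DIGEST_LEN)

print(md5_table)                     # 800000 bytes, the paper's 800 KBytes
print(sha1_table)                    # 1000000 bytes
print(md5_table > receiver_buffer)   # True: the table alone overflows the buffer
```

Even with the shorter MD5 digest, the table is more than three times the size of the receiver buffer in this example, which is the storage problem the text describes.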

The above solution can be further modified by using an authentication tree: the blocks
are placed as the leaves of a binary tree and each internal node takes as a value the hash
of its children (see [17]). This way the sender needs to sign and send only the root of this
tree. However, in order to authenticate each following block the sender has to send the
whole authentication path (i.e., the nodes on the path from the root to the block, plus their
siblings) to the receiver. This means that if the stream has $k$ blocks, the authentication
information associated with each block will be $O(\log k)$.
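To make the $O(\log k)$ overhead concrete, here is a minimal sketch of an authentication tree. It is not the construction of [17] verbatim: it assumes the number of blocks is a power of two, uses SHA-256 as the hash, and omits the signature on the root; all function names are ours.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return the tree as a list of levels: levels[0] holds the leaf hashes,
    levels[-1] holds the single root. Assumes len(blocks) is a power of two."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Sibling hashes from the leaf at `index` up to (but excluding) the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])   # sibling of the current node
        index //= 2
    return path


def verify_block(block, index, path, root):
    """Recompute the root from one block and its authentication path."""
    node = h(block)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

For a stream of 8 blocks each authentication path contains 3 hashes, for 1024 blocks it contains 10, so the per-block overhead grows with the length of the stream.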

As we will see shortly, our solution eliminates all these shortcomings. The basic idea works
for both infinite and finite streams, only one expensive digital signature is ever computed,
there are no big tables to store, and the size of the authentication information associated
with each block does not depend on the size of the stream.

Digital Signatures vs. Simple Authentication. Notice that if the receiver were only
interested in establishing the identity of the sender, a solution based on MACs would suffice.
Indeed, once the sender and receiver share a secret key, the stream could be authenticated
block by block using a MAC computation on it. Since MACs are usually faster than
signatures to compute and verify, this solution would not incur the computational cost
associated with the similar signature-based solution described above.

¹ This is the case now (Java Developer Kit 1.1) for large signed Java applets which are distributed as a
collection of Java archives (JAR) where each archive has a signed table of hashes of contents and the archives
are loaded in the order given in the HTML page in which the applet is embedded.

However, a MAC-based approach would not enjoy the non-repudiation property. We
stress that we require this property for our solution. Moreover, for this property to be
meaningful in the context of streams, we need to require that each prefix of the stream
be non-repudiable. That is, if the stream is $B = B_1, B_2, \ldots$ where each $B_i$ is a block, we
require that each prefix $\mathcal{B}_i = B_1, \ldots, B_i$ be non-repudiable. This rules out a solution in which
the sender just attaches a MAC to each block and then signs the whole stream at the end.

This is to prevent the sender from interrupting the transmission of the stream before
the non-repudiability property is achieved. It is also a guarantee for the receiver. Consider
the following scenario: the receiver notices that the applets she is downloading
are damaging her machine. She interrupts the transfer in order to limit the
damage, but at the same time she still wants some proof to bring to court that the substream
downloaded so far did indeed come from the sender.

In general, non-repudiation is crucial when the stream is being sold as electronic
merchandise. With the advent of music and video distribution over the Internet, it is clear
that  such  transactions  must  be  protected  with  mechanisms  that  allow  the  resolution  of 
disputes  through  non-repudiation. 

There are other reasons why a solution based on digital signatures could be preferable to
one based on MACs. Consider for example the case in which content is being broadcast to
a large audience whose membership changes over time. In this scenario (even if non-repudiation
is not required) simply signing the content may end up being the simplest and most efficient
solution, since it avoids problems of key management among a large number of users.

1.3  Our  Solution  in  a  Nutshell 

Our solution makes some reasonable, practical assumptions about the nature of the streams
being authenticated. First of all we assume that it is possible for the sender to embed
authentication information in the stream. This is usually the case; Section 8 shows
how to do this in most real-world situations, such as MPEG video/audio. We also assume that
the receiver has a "small" buffer in which it can first authenticate the received bits before
consuming them. Finally we assume that the receiver has processing power or hardware
that can compute a small number of fast cryptographic checksums faster than the incoming
stream rate while still being able to play the stream in real time.

The basic idea of our solution is to divide the stream into blocks and embed some authentication
information in the stream itself. The authentication information of the $i$-th block will
be used to authenticate the $(i+1)$-st block. This way the signer needs to sign just the first
block and then the properties of this single signature will “propagate” to the rest of the
stream through the authentication information. Of course the key problem is to perform
the authentication of the internal blocks fast. We distinguish two cases.

In the first scenario the stream is finite and is known in its entirety to the signer in
advance. This is not a very limiting requirement since it covers most Internet applications
(digital movies, digital sounds, applets). In this case we will show that a single
hash computation will suffice to authenticate the internal blocks. The idea is to embed in
the current block a hash of the following block (which in turn includes the hash of the
following one, and so on).

The second case is for (potentially infinite) streams which are not known in advance to
the signer (for example live feeds, like sports event broadcasting and chat rooms). In this
case our solution is less efficient, as it requires several hash computations to authenticate a
block (although depending on the embedding mechanism these hash computations can be
amortized over the length of the block). The size of the embedded authentication information
is also an issue in this case. The idea here is to use fast 1-time signature schemes
(introduced in [15, 16]) to authenticate the internal blocks. So block $i$ will contain a 1-time
public key and also the 1-time signature of itself with respect to the key contained in
block $i - 1$. This signature authenticates not only the stream block but also the 1-time key
attached to it.

1.4  Related  Work 

The "chaining" technique of embedding the hash of the following block in the current block
can be seen as a variation of the Merkle-Damgård meta-method to construct hash functions
based on a simpler compression function [18, 9]. The novelty here is that we exploit the
structure of the construction to allow fast authentication of single blocks in sequential order.
A similar idea had been used in the context of time-stamping mechanisms by Haber and
Stornetta [14]. It can also be seen as a weak construction of accumulators as introduced
in [4]. An accumulator for $k$ blocks $B_1, \ldots, B_k$ is a single value ACC that allows a signer
to quickly authenticate any of the blocks in any particular order. Accumulators based on
the RSA assumption were proposed in [4]. In our case we have a much faster construction
based on collision-free hash functions, since we exploit the property that the blocks must
be authenticated in a specific order.

The proofs of security of our schemes are somewhat similar to a proof by Damgård in
[8] of the general result that combining a secure signature scheme and a collision intractable
hash function yields a secure signature scheme.

Mixing "regular" signatures with 1-time signatures for the purpose of improving efficiency
is discussed in [12]. However, in that paper the focus is on making the signing
operation for a message $M$ efficient by dividing it into two parts. In the off-line part the
signer signs a 1-time public key with his long-lived secret key even before the message $M$
is known. Then, when $M$ has to be sent, the signer computes a 1-time signature of $M$ with
the authenticated 1-time public key and sends out $M$ tagged with the 1-time public key
and the two signatures. Notice that the receiver must compute two signature verifications:
one on the long-lived key and one on the 1-time key. In our scheme we need to make both
signing and verification extremely fast, and indeed in our case each block (except for the
first) is signed (and hence verified) only once with a 1-time key.

We also use the idea of using old keys in order to authenticate new keys. This has
appeared in several places but always for long-lived keys. Examples include [2, 20, 24]
where this technique is used to build provably secure signature schemes. We stress that the
results in [2, 20, 24] are mostly of theoretical interest and do not yield practical schemes.
Our on-line solution combines these two ideas in a novel way, by using the chaining
technique with 1-time keys and embedding the keys inside the stream flow so that old keys can
authenticate both the new keys and the current stream block at the same time.

2  Recent  Developments 

Subsequent to the publication of a preliminary version of this work in CRYPTO '97, there
has been a lot of research activity in the area of source authentication techniques for multicast
communications, some of which is applicable to the problem of stream signing.
Authenticating the sender of data packets is a fundamental security issue in multicast, and
to date, no completely satisfactory solution is known for this problem. Techniques such as
the use of MACs, which work well for unicast settings, are completely insecure in multiple
receiver settings and much research has focussed on the design of efficient signature schemes
for multicast packet authentication. At first glance, it appears that the stream signing work
presented in this paper should be applicable to this problem. However, it turns out that
on the Internet, most multicast data travels over an unreliable transport and thus the techniques
developed in this paper are not directly applicable since they assume reliability. On
the other hand, some of the efficient signature mechanisms developed for multicast packet
authentication [26, 23] could be used in conjunction with the techniques presented in this
paper to yield even better stream signing techniques.

3  Preliminaries 

In the following we denote by $n$ the security parameter. We say that a function $\epsilon(n)$ is
negligible if for all $c$ there exists an $n_0$ such that, for all $n > n_0$, $\epsilon(n) < 1/n^c$.
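As a quick illustration of this definition, consider two functions chosen by us (they do not appear in the paper): one that is negligible and one that is not.

```latex
\epsilon_1(n) = 2^{-n}: \quad \text{for every } c \text{ there is an } n_0 \text{ such that } 2^{-n} < n^{-c} \text{ for all } n > n_0, \text{ so } \epsilon_1 \text{ is negligible.}

\epsilon_2(n) = n^{-2}: \quad \text{for } c = 3 \text{ the inequality } n^{-2} < n^{-3} \text{ fails for every } n > 1, \text{ so } \epsilon_2 \text{ is not negligible.}
```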

Collision-Resistant Hash Functions. Let $\mathcal{H}$ be a family of functions that map arbitrarily
long binary strings into binary strings of a fixed length $k$. We say that $\mathcal{H}$ is a
collision-resistant family of hash functions if any polynomial-time algorithm that is given
as input a description of a random element $H \in \mathcal{H}$ finds a collision, i.e., a pair $(x, y)$ such
that $x \neq y$ and $H(x) = H(y)$, only with negligible probability $\epsilon(k)$.

MD5 [21] and SHA-1 [19] are conjectured collision-resistant hash functions, i.e., random
representatives of a family $\mathcal{H}$ with the above property.

Signature Schemes. A signature scheme is a triplet $(G, S, V)$ of probabilistic polynomial-time
algorithms satisfying the following properties:

• $G$ is the key generation algorithm. On input $1^n$ it outputs a pair $(SK, PK) \in \{0,1\}^{2n}$.
$SK$ is called the secret (signing) key and $PK$ is called the public (verification) key.

• $S$ is the signing algorithm. On input a message $M$ and the secret key $SK$, it outputs
a signature $\sigma$.

• $V$ is the verification algorithm. For every $(PK, SK) = G(1^n)$ and $\sigma = S(SK, M)$, it
holds that $V(PK, \sigma, M) = 1$.

In [13] security for signature schemes is defined in several variants. The strongest variant
is called “existential unforgeability against adaptively chosen message attack”. That is, we
require that no efficient algorithm will be able to produce a valid signed message, even after
seeing several signed messages of its choice.

One-Time Signatures. Some signature schemes satisfy the definition of security of [13]
only if we allow the adversary to see a limited number of signed messages. In
particular there exist signature schemes that are secure only if used to sign a single message.
The main advantage of schemes of this type is that they are usually much faster to execute
than regular signature schemes.

Stream Signatures. We define a stream to be a (possibly infinite) sequence of blocks
$B = B_1, B_2, \ldots$ where each $B_i \in \{0,1\}^c$ for some constant $c$.²

We distinguish two cases. In the first case we assume that the stream is finite and known
to the sender in advance. We call this the off-line case. Conversely, in the on-line case the
signer must process one (or a few) blocks at a time with no knowledge of the future part
of the stream.

Definition 1 An off-line stream signature scheme is a triplet $(G, S, V)$ of probabilistic
polynomial-time algorithms satisfying the following properties:

• $G$ is the key generation algorithm. On input $1^n$ it outputs a pair $(SK, PK) \in \{0,1\}^{2n}$.
$SK$ is called the secret (signing) key and $PK$ is called the public (verification) key.

• $S$ is the signing algorithm. On input a finite stream $B = B_1, \ldots, B_k$ and the secret
key $SK$, algorithm $S$ outputs a new stream $B' = B'_1, \ldots, B'_k$ where $B'_i = (B_i, A_i)$.

• $V$ is the verification algorithm. For every $(PK, SK) = G(1^n)$ and $B' = S(SK, B)$, it
holds that $V(PK, B'_1, \ldots, B'_i) = 1$ for $1 \le i \le k$.

Notice that we modeled the off-line property by the fact that the signing algorithm is given
the whole stream in advance. Yet the verifier is required to authenticate each prefix of
the stream without needing to see the rest of the stream. As will become clear in the
following, our algorithms will not require the off-line verifier to store the whole past stream
either.

Definition 2 An on-line stream signature scheme is a triplet $(G, S, V)$ of probabilistic
polynomial-time algorithms satisfying the following properties:

• $G$ is the key generation algorithm. On input $1^n$ it outputs a pair $(SK, PK) \in \{0,1\}^{2n}$.
$SK$ is called the secret (signing) key and $PK$ is called the public (verification) key.

• $S$ is the signing algorithm. Given a (possibly infinite) stream $B = B_1, B_2, \ldots$, algorithm
$S$ with input the secret key $SK$ processes the blocks one at a time, i.e.,

$S(SK, B_1, \ldots, B_i) = B'_i = (B_i, A_i)$

• $V$ is the verification algorithm. For every $(PK, SK) = G(1^n)$ and $B'_1, B'_2, \ldots$ such
that $B'_i = S(SK, B_1, \ldots, B_i)$ for all $i$, it holds that $V(PK, B'_1, \ldots, B'_i) = 1$ for all $i$.

² The assumption that the blocks all have the same size is not really necessary. We just make it for clarity
of presentation.
Notice that in the on-line definition we have the signer process each block "on the fly", so
knowledge of future blocks is not needed. In this case too the definition seems to require
knowledge of all past blocks for both signer and verifier; however, this does not have to be
the case (indeed, in our solution some past blocks may be discarded).

The above definitions say nothing about security. In order to define security for stream
signing we use the same notions of security introduced in [13]. That is, we require that no
efficient algorithm will be able to produce a valid signed stream, even after seeing several
signed streams. However, notice that given our definition of signed streams, a prefix of a
valid signed stream is itself a valid signed stream. So the forger can present a "different"
signed stream by just taking a prefix of the ones seen before. However, this hardly constitutes
forgery, so we rule it out in the definition. With $B^{(1)} \subseteq B^{(2)}$ we denote the fact that
$B^{(1)}$ is a prefix of $B^{(2)}$.

Definition 3 We say that an off-line (resp. on-line) stream signature scheme $(G, S, V)$ is
secure if any probabilistic polynomial-time algorithm $F$, given as input the public key $PK$
and adaptively chosen signed streams $B'^{(j)}$ for $j = 1, 2, \ldots$, outputs a new, previously unseen
valid signed stream $B' \not\subseteq B'^{(j)}$ for all $j$ only with negligible probability.

For signed streams we slightly abuse the notation: when we write $B'^{(1)} \not\subseteq B'^{(2)}$ we mean
not only that $B'^{(1)}$ is not a prefix of $B'^{(2)}$ but also that the underlying "basic" unsigned streams
are in the same relationship, i.e., $B^{(1)} \not\subseteq B^{(2)}$.

This  is  the  definition  of  existential  unforgeability  against  adaptively-chosen  message  attack, 
the  strongest  of  the  notions  presented  in  [13].  Following  [13]  weaker  variants  can  be  defined. 
Notice  that  the  adversary  is  polynomially  bounded,  thus  he  can  query  only  a  bounded 
number  of  streams  whose  total  length  will  also  be  bounded  by  a  polynomial. 

4  The  Off-Line  Solution 

In this case we assume that the sender knows the entire stream in advance (e.g., music/movie
broadcast). Assume for simplicity that the stream is such that it is possible to
reserve 20 bytes of extra authentication information in a block of size $c$.

The  stream  is  logically  divided  into  blocks  of  size  c.  The  receiver  has  a  buffer  of  size  c. 
The  receiver  first  receives  the  signature  on  the  20  byte  hash  (e.g.,  SHA-1)  of  the  first  block. 
After  verification  of  the  signature  the  receiver  knows  what  the  hash  of  the  first  block  should 
be  and  then  starts  receiving  the  full  stream  and  starts  computing  its  hash  block  by  block. 
When  the  receiver  receives  the  first  block,  it  checks  its  hash  against  what  the  signature  was 
verified  upon.  If  it  matches,  it  plays  the  block  otherwise  it  rejects  it  and  stops  playing  the 
stream.  How  are  other  blocks  authenticated?  The  key  point  is  that  the  first  block  contains 
the  20  byte  hash  of  the  second  block,  the  second  block  contains  the  20  byte  hash  of  the 
third  block  and  so  on...  Thus,  after  the  first  signature  check,  there  are  just  hashes  to  be 
checked  for  every  subsequent  block. 

In more detail, let $(G, S, V)$ be a regular signature scheme. The sender has a pair of
secret-public keys $(SK, PK) = G(1^n)$ for this signature scheme. Also let $H$ be a collision-resistant
cryptographic hash function. If the original stream is

$B = B_1, B_2, \ldots, B_k$

and the resulting signed stream is

$B' = B'_0, B'_1, B'_2, \ldots, B'_k$

the processing is done backwards on the original stream as follows:

$B'_k = \langle B_k, 00\ldots0 \rangle$

$B'_i = \langle B_i, H(B'_{i+1}) \rangle$ for $i = 1, \ldots, k-1$

$B'_0 = \langle H(B'_1, k), S(SK, H(B'_1, k)) \rangle$

Notice that on the sender side, computing the signature and embedding the hashes requires
a single backwards pass on the stream, hence the restriction that the stream is fully known
in advance. Notice also that the first block $B'_0$ of the signed stream contains an encoding
of the length of the stream ($k$).

The receiver verifies the signed stream as follows: on receiving $B'_0 = \langle B, A_0 \rangle$ she
checks that

$V(PK, A_0, B) = 1$

and extracts the length $k$ in blocks of the stream (which we may assume is encoded in the
first block). Then on receiving $B'_i = \langle B_i, A_i \rangle$ (for $1 \le i \le k$) the receiver accepts $B_i$ if

$H(B'_i) = A_{i-1}$

Thus  the  receiver  has  to  compute  a  single  public-key  operation  at  the  beginning,  and  then 
only  one  hash  evaluation  per  block.  Notice  that  no  big  table  is  needed  in  memory. 
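The backward signing pass and the forward verification pass can be sketched as follows. This is a minimal illustration under several assumptions of ours, not the authors' implementation: SHA-256 replaces the 20-byte SHA-1 hash, `sign` and `verify` are caller-supplied callables standing in for $S(SK, \cdot)$ and $V(PK, \cdot, \cdot)$, the length $k$ is sent in the clear inside $B'_0$ as an 8-byte integer, and the check on the first block hashes $B'_1$ together with $k$ so that it matches the signed value $H(B'_1, k)$.

```python
import hashlib

DIGEST_LEN = 32  # SHA-256 output size; the paper's example reserves 20 bytes (SHA-1)


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def sign_stream(blocks, sign):
    """Backward pass over B_1..B_k; returns [B'_0, B'_1, ..., B'_k]."""
    k = len(blocks)
    if k == 0:
        raise ValueError("stream must contain at least one block")
    signed = [None] * (k + 1)
    signed[k] = blocks[k - 1] + bytes(DIGEST_LEN)      # B'_k = <B_k, 00...0>
    for i in range(k - 1, 0, -1):
        signed[i] = blocks[i - 1] + H(signed[i + 1])   # B'_i = <B_i, H(B'_{i+1})>
    root = H(signed[1] + k.to_bytes(8, "big"))         # H(B'_1, k)
    signed[0] = (k, root, sign(root))                  # B'_0, with k sent in the clear
    return signed


def verify_stream(signed, verify):
    """Yield authenticated payload blocks in order; raise ValueError on the first failure."""
    it = iter(signed)
    k, root, signature = next(it)
    if not verify(signature, root):
        raise ValueError("signature on B'_0 rejected")
    expected = root
    for i, block in zip(range(1, k + 1), it):
        digest = H(block + k.to_bytes(8, "big")) if i == 1 else H(block)
        if digest != expected:
            raise ValueError(f"block {i} rejected")
        expected = block[-DIGEST_LEN:]   # A_i: the hash the next block must match
        yield block[:-DIGEST_LEN]        # B_i: authenticated, safe to consume
```

The verifier keeps only one digest (`expected`) between blocks, which reflects the remark that no big table is needed in memory. Any real signature library can be plugged in for `sign` and `verify`.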

5  The  On-Line  Solution 

In this case the sender does not know the entire stream in advance (e.g., live broadcast). In
this  scenario  it  is  important  that  also  the  operation  of  signing  (and  not  just  verification)  be 
fast,  since  the  sender  himself  is  bound  to  produce  an  authenticated  stream  at  a  potentially 
high  rate. 

One-Time Signatures. In the following we will use a special kind of signature scheme
introduced in [15, 16]. These are signatures which are much faster to compute and verify
than regular signatures since they are based on one-way functions and do not require a
trapdoor function. Known conjectured one-way functions (such as DES or SHA-1) are much more
efficient than the known conjectured trapdoor functions such as RSA. However, these schemes
cannot be used to sign an arbitrary number of messages but only a predetermined number of them
(usually one). Several other 1-time schemes have been proposed [12, 6, 7]; in Section 7 we
discuss possible instantiations for our purpose.
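To give a feel for how a signature can be built from a one-way function alone, here is a sketch of a Lamport-style one-time signature. Whether this matches the exact variant the paper has in mind is an assumption on our part. SHA-256 plays the role of the one-way function, and the key and signature encodings are our own choices.

```python
import hashlib
import os

N = 32  # bytes per hash value (SHA-256)


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def digest_bits(msg: bytes):
    """The 256 bits of H(msg), most significant bit of each byte first."""
    return [(byte >> (7 - j)) & 1 for byte in H(msg) for j in range(8)]


def keygen():
    """Secret key: 256 pairs of random strings. Public key: their hashes, concatenated."""
    sk = [(os.urandom(N), os.urandom(N)) for _ in range(256)]
    pk = b"".join(H(x0) + H(x1) for x0, x1 in sk)
    return sk, pk


def sign(sk, msg: bytes) -> bytes:
    """Reveal one secret string per bit of H(msg). A key must sign only one message."""
    return b"".join(sk[i][bit] for i, bit in enumerate(digest_bits(msg)))


def verify(pk: bytes, sig: bytes, msg: bytes) -> bool:
    """Check that each revealed string hashes to the matching half of the public key."""
    if len(sig) != 256 * N or len(pk) != 512 * N:
        return False
    for i, bit in enumerate(digest_bits(msg)):
        revealed = sig[i * N:(i + 1) * N]
        offset = (2 * i + bit) * N
        if H(revealed) != pk[offset:offset + N]:
            return False
    return True
```

Signing costs no hash evaluations beyond hashing the message, and verification costs 256 of them. With these parameters a signature is 8192 bytes and a public key 16384 bytes, which shows why the size of the embedded authentication information matters in the on-line case.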

In this case too the stream is split into blocks. Initially the sender sends a signed public key
for a 1-time signature scheme. Then he sends the first block along with a 1-time signature
on its hash, based on the 1-time public key sent in the previous block. The first block also
contains a new 1-time public key to be used to verify the signature on the second block, and
this structure is repeated in all the blocks.
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More  in  detail:  let  ns  denote  with  (G,  S',  V)  a  regular  signature  scheme  and  with  (p,  s,  v) 
a  1-time  signature  scheme.  With  H  we  still  denote  a  collision-resistant  hash  function.  The 
sender  has  long-lived  keys  ( SK,PK )  =  G(ln).  Let 

B  =  jBi, 

be  the  original  stream  (notice  that  in  this  case  we  are  not  assuming  the  stream  to  be  finite) 
and 

the  signed  stream  constructed  as  follows.  For  each  i  >  1  let  us  denote  with  ( ski,pki )  =  p(ln) 
the  output  of  an  independent  run  of  algorithm  g .  Then 

Bf0  =<pfc05Sr(5A’,pfco)  > 

(public  keys  of  1-time  signature  schemes  are  usually  short  so  they  need  not  to  be  hashed 
before  signing) 

B[  =<  Biipki,  s(sfcj_i, H(Bi,pki))  >  for  i  >  1 

Notice  that  apart  from  a  regular  signature  on  the  first  block,  all  the  following  signatures 
are  1-time  ones,  thus  much  faster  to  compute  (including  the  key  generation,  which  does  not 
have  to  be  done  on  the  fly.) 

The receiver verifies the signed stream as follows. On receiving $B'_0 = \langle pk_0, A_0 \rangle$ she
checks that

$V(PK, A_0, pk_0) = 1$

and then on receiving $B'_i = \langle B_i, pk_i, A_i \rangle$ she checks that

$v(pk_{i-1}, A_i, H(B_i, pk_i)) = 1$

Whenever one of these checks fails, the receiver stops playing the stream. Thus the receiver
has  to  compute  a  single  public-key  operation  at  the  beginning,  and  then  only  one  1-time 
signature  verification  per  block. 
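
The chaining of one-time keys can be sketched in Python as follows. This is only an illustration under stated assumptions: SHA-256 stands in for $H$, a plain Lamport-style scheme stands in for the one-time scheme $(g, s, v)$, and the regular scheme $(G, S, V)$ is represented by two caller-supplied callables, `sign_root` and `verify_root`, which are hypothetical names.

```python
import hashlib
import os


def H(data: bytes) -> bytes:
    # Stand-in for the collision-resistant hash H (assumption: SHA-256).
    return hashlib.sha256(data).digest()


def bits_of(digest: bytes):
    return [(byte >> j) & 1 for byte in digest for j in range(8)]


# Stand-in one-time scheme (g, s, v): Lamport-style, one secret pair per digest bit.
def g():
    sk = [(os.urandom(32), os.urandom(32)) for _ in range(256)]
    pk = H(b"".join(H(x0) + H(x1) for x0, x1 in sk))
    return sk, pk


def s(sk, digest):
    # Reveal the secret selected by each bit, plus the hash of the other secret.
    return [(x1, H(x0)) if bit else (x0, H(x1))
            for (x0, x1), bit in zip(sk, bits_of(digest))]


def v(pk, sig, digest):
    parts = [other + H(shown) if bit else H(shown) + other
             for (shown, other), bit in zip(sig, bits_of(digest))]
    return H(b"".join(parts)) == pk


def sign_stream(blocks, sign_root):
    """Yield B'_0 = <pk_0, S(SK, pk_0)>, then B'_i = <B_i, pk_i, s(sk_{i-1}, H(B_i, pk_i))>."""
    sk_prev, pk0 = g()
    yield (pk0, sign_root(pk0))
    for block in blocks:               # blocks may be an unbounded iterator
        sk_i, pk_i = g()
        yield (block, pk_i, s(sk_prev, H(block + pk_i)))
        sk_prev = sk_i


def verify_stream(signed, verify_root):
    """Yield authenticated blocks; stop at the first failed check."""
    it = iter(signed)
    pk_prev, root_sig = next(it)
    if not verify_root(root_sig, pk_prev):
        return
    for block, pk_i, sig in it:
        if not v(pk_prev, sig, H(block + pk_i)):
            return                     # the receiver stops playing the stream
        yield block
        pk_prev = pk_i
```

Both functions are generators, so the sender never needs the whole stream in advance and the receiver holds only the most recent one-time public key.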

6  Proofs  of  Security 

We  are  able  to  prove  the  security  of  our  stream  signature  schemes  according  to  the  definitions 
presented  in  Section  3,  provided  that  the  underlying  components  used  to  build  the  schemes 
are  secure  on  their  own. 

The Off-Line Case. Let us denote with $(G_{off}, S_{off}, V_{off})$ the off-line stream signature
scheme described in Section 4. With $(G, S, V)$ let us denote the "regular" signature scheme
and with $H$ the hash function used in the construction. The following holds.

Theorem 1 If $(G, S, V)$ is a secure signature scheme and $H$ is a collision-resistant hash
function then the resulting stream signature scheme $(G_{off}, S_{off}, V_{off})$ is secure.

Proof

Assume the thesis is false, i.e., that there is an algorithm $\mathcal{F}$ that succeeds in an
existential forgery against $(G_{off}, S_{off}, V_{off})$ using an adaptively-chosen message attack with
non-negligible probability $\epsilon$. That is, $\mathcal{F}$ runs on input $PK$, adaptively asks for the signed
versions of $\ell$ streams $B^{(1)}, \ldots, B^{(\ell)}$ where $B^{(j)} = B^{(j)}_1 \ldots B^{(j)}_{k_j}$ and receives them; let them
be $B'^{(1)}, \ldots, B'^{(\ell)}$. Then $\mathcal{F}$ outputs a valid signed stream $B'$ where $B' = B'_0 \ldots B'_k$ is not a
prefix of any of the previous ones, i.e., $B' \not\subseteq B'^{(j)}$ for all $j = 1, \ldots, \ell$. One of the following two
cases must be true.

Case 1: With probability at least $\epsilon/2$, $\mathcal{F}$ outputs a signed stream whose first block coincides
with the first block of one of the signed streams it asked for before. That is, there exists a $j$
such that $B'_0 = B'^{(j)}_0$. We show in this case how $\mathcal{F}$ can be used to build an algorithm $\mathcal{F}_c$
which finds collisions for $H$.

$\mathcal{F}_c$ runs as follows. It first runs $G$ to obtain a pair of public and secret keys $(PK, SK)$.
Then it runs $\mathcal{F}$ and uses $SK$ to sign all the requests $B^{(j)}$ for $j = 1, \ldots, \ell$ that $\mathcal{F}$ makes.
Now with probability at least $\epsilon/2$, $\mathcal{F}$ outputs $B' \not\subseteq B'^{(j)}$ for all $j$, but such that there exists $j$
such that $B'_0 = B'^{(j)}_0$. Let $k$ be the length of the stream $B'$ and $k_j$ the length of the stream
$B'^{(j)}$.

If $k \neq k_j$ then $(B'_1, k) \neq (B'^{(j)}_1, k_j)$. But since $B'_0 = B'^{(j)}_0$ it must be that $H(B'_1, k) =
H(B'^{(j)}_1, k_j)$. So $\mathcal{F}_c$ can output the pair $(B'_1, k)$, $(B'^{(j)}_1, k_j)$ as a collision for $H$.

If $k = k_j$, then since $B' \not\subseteq B'^{(j)}$ there must exist an $0 < \alpha \le k$ such that $B'_\alpha \neq B'^{(j)}_\alpha$
while $B'_\beta = B'^{(j)}_\beta$ for all $0 \le \beta < \alpha$. That is, we have that $B'_{\alpha-1} = B'^{(j)}_{\alpha-1}$, which (by our
construction of signed streams) implies that $H(B'_\alpha) = H(B'^{(j)}_\alpha)$. So $\mathcal{F}_c$ outputs the pair
$B'_\alpha$, $B'^{(j)}_\alpha$ as a collision for $H$.

Case 2: With probability at least $\epsilon/2$, $\mathcal{F}$ outputs a signed stream whose first block is different
from the first block of any of the signed streams it asked for before. That is, for all $j$ we
have $B'_0 \neq B'^{(j)}_0$. We show in this case how $\mathcal{F}$ can be used to build an algorithm $\mathcal{F}_s$ which
forges signatures for the regular signature scheme $(G, S, V)$.

$\mathcal{F}_s$ runs on input a public key $PK$ of the signature scheme $(G, S, V)$. It is also allowed
to query a signature oracle to get signed messages of its choice. $\mathcal{F}_s$ starts by running $\mathcal{F}$.
When the latter requests a signature on a stream $B^{(j)} = B^{(j)}_1 \ldots B^{(j)}_{k_j}$, $\mathcal{F}_s$ first prepares the
last $k_j$ blocks of the signed stream, $B'^{(j)}_1, \ldots, B'^{(j)}_{k_j}$, and then it uses its allowed queries to
the signature oracle to compute $B'^{(j)}_0 = \langle H(B'^{(j)}_1, k_j), S(SK, H(B'^{(j)}_1, k_j)) \rangle$.

Now with probability at least $\epsilon/2$, $\mathcal{F}$ outputs $B' = B'_0 \ldots B'_k$ such that $B'_0 \neq B'^{(j)}_0$ for all
$j$. This in turn means that $B'_0$ is a message/signature pair that $\mathcal{F}_s$ has not asked before to
the signature oracle. $\mathcal{F}_s$ stops, outputting this pair as the forged signature.

Since  e/2  is  a  non-negligible  probability,  either  Case  1  or  Case  2  contradicts  our  hypothesis, 
so  the  thesis  must  be  true. 


The On-Line Case. Let us denote with $(G_{on}, S_{on}, V_{on})$ the on-line stream signature scheme
described in Section 5. With $(G, S, V)$ let us denote the "regular" signature scheme, with
$(g, s, v)$ the one-time signature scheme and with $H$ the hash function used in the
construction. The following holds.

Theorem 2 If $(G, S, V)$ and $(g, s, v)$ are secure signature schemes and $H$ is a collision-resistant
hash function then the resulting stream signature scheme $(G_{on}, S_{on}, V_{on})$ is secure.

Proof

Assume the thesis is false, i.e., that there is an algorithm $\mathcal{F}$ that succeeds in an
existential forgery against $(G_{on}, S_{on}, V_{on})$ using an adaptively-chosen message attack with
non-negligible probability $\epsilon$. That is, $\mathcal{F}$ runs on input $PK$, adaptively asks for the signed
versions of $\ell$ streams (see footnote 3) $B^{(1)}, \ldots, B^{(\ell)}$ where $B^{(j)} = B^{(j)}_1 B^{(j)}_2 \ldots$ and receives
them; let them be $B'^{(1)}, \ldots, B'^{(\ell)}$. Then $\mathcal{F}$ outputs a valid signed stream $B'$ where
$B' = B'_0 \ldots B'_k$, which is not a prefix of any of the previous ones, i.e., $B' \not\subseteq B'^{(j)}$ for all
$j = 1, \ldots, \ell$. One of the following cases must be true.

Case 1: With probability at least $\epsilon/2$, $\mathcal{F}$ outputs a signed stream whose first block is different
from the first block of any of the signed streams it asked for before. That is, for all $j$ we
have $B'_0 \neq B'^{(j)}_0$. We show in this case how $\mathcal{F}$ can be used to build an algorithm $\mathcal{F}_1$ which
forges signatures for the regular signature scheme $(G, S, V)$.

$\mathcal{F}_1$ runs on input a public key $PK$ of the signature scheme $(G, S, V)$. It is also allowed
to query a signature oracle to get signed messages of its choice. $\mathcal{F}_1$ starts by running $\mathcal{F}$.
When the latter requests a signature on a stream $B^{(j)} = B^{(j)}_1, \ldots$, $\mathcal{F}_1$ prepares the first block
of the signed stream $B'^{(j)}$ by running $g$ to get $(pk^{(j)}_0, sk^{(j)}_0)$ and then queries the signature
oracle to get a signature on $pk^{(j)}_0$. The remaining blocks of the stream can be easily prepared
by $\mathcal{F}_1$ by running the 1-time key generation algorithm $g$ as needed.

Now with probability at least $\epsilon/2$, $\mathcal{F}$ outputs $B' = B'_0 \ldots$ such that $B'_0 \neq B'^{(j)}_0$ for all $j$.
This in turn means that $B'_0$ is a message/signature pair for $(G, S, V)$ that $\mathcal{F}_1$ has not
asked before to the signature oracle. $\mathcal{F}_1$ stops, outputting this pair as the forged signature.

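Case 2: With probability at least $\epsilon/2$, $\mathcal{F}$ outputs a signed stream whose first block coincides
with the first block of one of the signed streams it asked for before. That is, there exists a $j$
such that $B'_0 = B'^{(j)}_0$. However, we also know that $B' \not\subseteq B'^{(j)}$. This implies that there exists
an $0 < \alpha \le k$ such that $B'_\alpha \neq B'^{(j)}_\alpha$ while $B'_{\alpha-1} = B'^{(j)}_{\alpha-1}$. Recall that, by our construction
of signed streams, we have

$B'_\alpha = \langle B_\alpha, pk_\alpha, s(sk_{\alpha-1}, H(B_\alpha, pk_\alpha)) \rangle$

$B'^{(j)}_\alpha = \langle B^{(j)}_\alpha, pk^{(j)}_\alpha, s(sk^{(j)}_{\alpha-1}, H(B^{(j)}_\alpha, pk^{(j)}_\alpha)) \rangle$

We know that $B'_\alpha \neq B'^{(j)}_\alpha$. One of these two subcases must be true.

Case 2a: With probability at least $\epsilon/4$ the output of $\mathcal{F}$ is such that

$H(B_\alpha, pk_\alpha) = H(B^{(j)}_\alpha, pk^{(j)}_\alpha)$.

We show how to construct an algorithm $\mathcal{F}_2$ that computes collisions for $H$.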

$\mathcal{F}_2$ runs as follows. It first runs $G$ to obtain a pair of public and secret keys $(PK, SK)$.
Then it runs $\mathcal{F}$. When the latter asks to sign a specific stream, $\mathcal{F}_2$ uses $SK$ to sign the first
block and then generates "on the fly" one-time keys (using $g$) to sign all the other blocks of
the given stream. Then $\mathcal{F}_2$ looks at the output of $\mathcal{F}$, which by assumption has the property
that

$H(B_\alpha, pk_\alpha) = H(B^{(j)}_\alpha, pk^{(j)}_\alpha)$

and since we know that $B'_\alpha \neq B'^{(j)}_\alpha$, $\mathcal{F}_2$ has found a collision.

3 We may assume that the forger adaptively chooses the components of the stream, i.e., after seeing the
$i$th block signed it creates the $(i+1)$st block to be signed.

Case 2b: With probability at least $\epsilon/4$ the output of $\mathcal{F}$ is such that

$H(B_\alpha, pk_\alpha) \neq H(B^{(j)}_\alpha, pk^{(j)}_\alpha)$.

We show how to construct an algorithm $\mathcal{F}_3$ that forges signatures in the 1-time scheme
$(g, s, v)$.

Let us denote with $K$ an upper bound on the total number of stream blocks that $\mathcal{F}$ asks
during its attempt at forgery. Without loss of generality let us assume that $\mathcal{F}$ always asks
$K$ blocks. Clearly $K$ is polynomial in our security parameter $n$.

$\mathcal{F}_3$ works as follows. It runs on input a 1-time key $pk$ and it can ask one query to get a
single message signed by the corresponding secret key $sk$. Its goal is to output a different
message and its signature under $sk$.

$\mathcal{F}_3$ runs as follows. It runs $G$ to obtain a long-lived key pair $(PK, SK)$ of $(G, S, V)$. It
then runs $g$ several times in order to obtain several one-time key pairs. Finally it selects
uniformly at random an integer $i$ between 1 and $K - 1$.

$\mathcal{F}_3$ starts running $\mathcal{F}$. Whenever the latter asks for a signed stream, $\mathcal{F}_3$ can sign the
first block since it knows $SK$, and it uses the generated 1-time keys for the internal blocks.
The exception is when $\mathcal{F}$ asks for the $i$th block (sequentially). In this case $\mathcal{F}_3$ uses $pk$
as the 1-time public key embedded in the $i$th block. This means that when $\mathcal{F}$ asks for the
$(i+1)$st block, $\mathcal{F}_3$ has to query the signature oracle in order to compute the 1-time signature
embedded in it.

At the end of this process, $\mathcal{F}$ stops, outputting a signed stream with the properties
outlined above. With probability $1/K$ the block $B'^{(j)}_{\alpha-1}$ is the one in which $\mathcal{F}_3$ used the
target public key $pk$. This in turn means that the 1-time signature contained in the block
$B'_\alpha$ output by $\mathcal{F}$ is valid under the key $pk$, yet it is on a different message than the one
queried by $\mathcal{F}_3$.

So with probability $\epsilon/(4K)$, which is still non-negligible, the forger $\mathcal{F}_3$ succeeds.


Remark: The statements of the above theorems are valid not only in asymptotic terms,
but also have a concrete interpretation which ultimately is reflected in the key lengths
used in the various components in order to achieve the desired level of security of the full
construction. It is not hard to see, by a close analysis of the proofs, that the results are
pretty tight. That is, a forger for the stream signing scheme can be transformed into an
attacker for one of the components (the hash function, the regular signature scheme and,
a little less optimally, the 1-time signature scheme) which runs in about the same time,
asks the same number of queries and has almost the same success probability. This in turn
means that there is no major degradation in the level of security of the compound scheme
and thus the basic components can be run with keys of ordinary length.

More precisely: for the off-line scheme, if we have a forger $\mathcal{F}$ that runs in time $T$, asks
for $q$ signed streams of total length $K$ and succeeds with probability $\epsilon$, then we have either
a collision-finder $\mathcal{F}_c$ for $H$ or a forger $\mathcal{F}_s$ for the regular signature scheme which runs in time
$T$, asks $q$ signature queries and $qK$ hashing queries and succeeds with probability $\epsilon/2$.

For the on-line scheme, if we have a forger $\mathcal{F}$ that runs in time $T$, asks for $q$ signed
streams of total length $K$ and succeeds with probability $\epsilon$, then we have

• either a forger $\mathcal{F}_1$ for the regular signature scheme which runs in time $T + Kt$, asks $q$
signature queries and $qK$ hashing queries and succeeds with probability $\epsilon/2$;

• or a collision-finder $\mathcal{F}_2$ for $H$ which runs in time $T + Kt$, asks $q$ signature queries and
$qK$ hashing queries and succeeds with probability $\epsilon/4$;

• or a forger $\mathcal{F}_3$ for the one-time signature scheme which runs in time $T + Kt$, asks $q$
signature queries and $qK$ hashing queries and succeeds with probability $\epsilon/(4K)$,

where $t$ is the time required to run the one-time signature scheme (key generation and
signature) algorithms.
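
The constants in the on-line bound follow from a simple counting argument over the cases of the proof of Theorem 2. In the sketch below, the names $\epsilon_1$, $\epsilon_{2a}$ and $\epsilon_{2b}$ for the probabilities that a successful forgery falls into Case 1, Case 2a and Case 2b are introduced here only for illustration.

```latex
\epsilon = \epsilon_1 + \epsilon_{2a} + \epsilon_{2b}
\quad\Longrightarrow\quad
\epsilon_1 \ge \frac{\epsilon}{2}
\ \text{ or }\
\epsilon_{2a} \ge \frac{\epsilon}{4}
\ \text{ or }\
\epsilon_{2b} \ge \frac{\epsilon}{4},
\qquad
\Pr[\text{one-time forgery}] \;\ge\; \frac{1}{K}\cdot\frac{\epsilon}{4} \;=\; \frac{\epsilon}{4K}.
```

Indeed, if $\epsilon_1 < \epsilon/2$ then $\epsilon_{2a} + \epsilon_{2b} > \epsilon/2$, so at least one of the two exceeds $\epsilon/4$. The extra factor $1/K$ accounts for guessing the block in which the target one-time key is embedded, which is why the reduction to the one-time scheme is the least tight of the three.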

7  Implementation  Issues 

7.1  The  Choice  of  the  One-Time  Signature  Scheme 

Several  one-time  schemes  have  been  presented  in  the  literature,  see  for  example  [15,  16,  12, 
6,  7].  The  main  parameters  of  these  schemes  are  signature  length  and  verification  time. 
In  the  solutions  we  know  of,  these  parameters  impose  conflicting  requirements,  i.e.,  if  one 
wants  a  scheme  with  short  signatures,  verification  time  goes  up,  while  schemes  with  longer 
signatures  can  have  a  much  shorter  verification  time.  In  our  on-line  solution  we  would  like 
to  keep  both  parameters  down.  Indeed,  the  verification  should  be  fast  enough  to  allow  the 
receiver  to  consume  the  stream  blocks  at  the  same  input  rate  she  receives  them.  At  the 
same  time,  since  the  signatures  are  embedded  in  the  stream,  it’s  important  to  keep  them 
small  so  that  they  will  not  reduce  the  throughput  rate  of  the  original  stream. 

We first suggest a scheme which obtains a reasonable compromise. The scheme is based
on a 1-way function $F$ in a domain $D$. It also uses a collision-resistant hash function $H$.
The scheme allows signing of a single $m$-bit message. It is based on a combination of ideas
from [15, 25]. Here are the details of the scheme.

Key Generation. Choose $m + \log m$ elements in $D$, let them be $a_1, \ldots, a_{m+\log m}$. This is
the secret key. The public key is

$pk = H(F(a_1), \ldots, F(a_{m+\log m}))$

Signing Algorithm. Let $M$ be the message to be signed. Append to $M$ the binary
representation of the number of zeros in $M$'s binary representation. Call $M'$ the resulting binary
string. The signature of $M$ is $s_1, \ldots, s_{m+\log m}$ where $s_i = a_i$ if the $i$th bit of $M'$ is 1, otherwise

$s_i = F(a_i)$.

Verification Algorithm. Check if

$H(t_1, \ldots, t_{m+\log m}) = pk$

where $t_i = s_i$ if the $i$th bit of $M'$ is 0, otherwise $t_i = F(s_i)$.

Security.  Intuitively  this  scheme  is  secure  since  it  is  not  possible  to  change  a  0  into  a  1  in 
the  binary  representation  of  the  message  M  without  having  to  invert  the  function  F.  It 
is  possible  to  change  a  1  into  a  0,  but  that  will  increase  the  number  of  0’s  in  the  binary 
representation of $M$, causing a bit to flip from 0 to 1 in the last $\log m$ bits of $M'$, and so
forcing the attacker to invert $F$ anyway.

Parameters. This scheme has signature length $|D|(m + \log m)$ where $|D|$ is the number of
bits required to represent elements of $D$. The receiver has to compute 1 hash computation
of $H$ plus on the average $(m + \log m)/2$ computations of $F$.

In practice we assume that $F$ maps 64-bit long strings into 64-bit long strings. Since
collision resistance is not required from $F$ we believe this parameter to be sufficient.
Conjectured good $F$'s can be easily constructed from efficient block ciphers like DES or from fast
hash functions like MD5 or SHA-1 (see footnote 4). Similarly $H$ can be instantiated to MD5 or SHA-1.
In general we may assume $m$ to be 128 or 160 if the message to be signed is first hashed
using MD5 or SHA-1.

The SHA-1 implementation then has signatures which are 1344 bytes long. The receiver
has to compute $F$ around 84 times on the average. With MD5 the numbers become 1080
bytes and 68 respectively. When used in our on-line scheme one also has to add 16 bytes
for the embedding of the public key in the stream.
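
The scheme above can be sketched in a few lines of Python. This is a minimal illustration, not a vetted implementation: $F$ is taken to be SHA-256 truncated to 64 bits, SHA-1 plays the role of both $H$ and the message hash (so $m = 160$), and the zero counter is encoded on 8 bits. These choices, and all function names, are assumptions made for the example.

```python
import hashlib
import os

ELEM = 8                     # |D| = 64 bits, as assumed in the text
M_BITS = 160                 # m: the message is first hashed with SHA-1
CTR_BITS = M_BITS.bit_length()   # 8 bits suffice for any zero count in 0..160
N = M_BITS + CTR_BITS        # number of secret elements (168)


def F(x: bytes) -> bytes:
    # Stand-in one-way function on 64-bit strings (assumption: truncated SHA-256).
    return hashlib.sha256(b"F" + x).digest()[:ELEM]


def H(parts) -> bytes:
    return hashlib.sha1(b"".join(parts)).digest()


def encode(message: bytes):
    """Bits of M' = SHA-1(message) followed by the binary count of its zero bits."""
    digest = hashlib.sha1(message).digest()
    bits = [(byte >> (7 - j)) & 1 for byte in digest for j in range(8)]
    zeros = bits.count(0)
    return bits + [(zeros >> (CTR_BITS - 1 - j)) & 1 for j in range(CTR_BITS)]


def keygen():
    sk = [os.urandom(ELEM) for _ in range(N)]
    pk = H(F(a) for a in sk)             # pk = H(F(a_1), ..., F(a_N))
    return sk, pk


def sign(sk, message: bytes):
    # s_i = a_i if the i-th bit of M' is 1, otherwise s_i = F(a_i)
    return [a if bit else F(a) for a, bit in zip(sk, encode(message))]


def verify(pk, message: bytes, sig) -> bool:
    bits = encode(message)
    if len(sig) != len(bits):
        return False
    # t_i = s_i if the i-th bit of M' is 0, otherwise t_i = F(s_i)
    return H(F(s) if bit else s for s, bit in zip(sig, bits)) == pk
```

With these parameters a signature consists of 168 elements of 8 bytes each, i.e. the 1344 bytes quoted above, and verification evaluates $F$ once for every 1 bit of $M'$.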

Remark:  Comparing  the  RSA  signature  scheme  with  verification  exponent  3  with  the 
above  schemes,  one  could  wonder  if  the  verification  algorithm  is  really  more  efficient  (2 
multiplications  versus  84  hash  computations).  Typical  estimates  today  are  that  an  RSA 
verification  is  comparable  to  100  hash  computations.  However,  we  remind  the  reader  that 
we  are  trying  to  improve  both  signature  generation  and  verification.  Indeed,  this  scheme  is 
used  in  the  on-line  case  and  as  such  both  operations  have  to  be  performed  on-line  and  thus 
efficiently.  The  improvement  in  signature  generation  is  much  more  substantial. 

Other schemes: The scheme above could be improved in the length of the signature by
using Winternitz's idea [25]. The secret key is composed of $m + 1$ elements of $D$, let them
be $a_0, a_1, \ldots, a_m$. The public key is $pk = H(F^m(a_0), F(a_1), \ldots, F(a_m))$. The signature on
a message $M$ is $s_0, s_1, \ldots, s_m$ where for $i \ge 1$, $s_i = a_i$ if the $i$th bit of $M$ is 1, otherwise
$s_i = F(a_i)$, while $s_0 = F^{\ell}(a_0)$ where $\ell$ is the number of 1's in $M$'s binary representation.
The verification of the signature takes the same amount of computation as the scheme
described above. However, the length of the signature is slightly shorter, i.e., $|D|(m + 1)$
bits. For a SHA-1 implementation this means 1288 bytes. As noticed in [12] the security
of this scheme is based on a somewhat stronger assumption on $F$. Namely $F$ is assumed
to be non-quasi-invertible, that is, on input $y$ it is infeasible to find an index $i$ and a value
$x$ such that $F^{i+1}(x) = F^{i}(y)$. Clearly if $F$ is a one-way permutation the above notion is
automatically satisfied.

4 As a cautionary remark to prevent attacks where the attacker builds a large table of evaluations of $F$, in
practice $F$ could be made different for each signed stream (or for each large portion of the signed stream) by
defining $F(x)$ to be $G(Salt \| x)$ where $G$ is a one-way 128 bit to 64 bit function, and the $Salt$ is generated
at random by the signer once for each stream or large pieces thereof.

This scheme can be further generalized as described in [12]. The message $M$ is split in
$m/t$ blocks of size $t$ bits, let them be $M_1, \ldots, M_{m/t}$. The secret key is composed of $m/t + 1$
elements of $D$, let them be $a_0, a_1, \ldots, a_{m/t}$. The public key is

$pk = H(F^{(2^t - 1) m/t}(a_0), F^{2^t - 1}(a_1), \ldots, F^{2^t - 1}(a_{m/t}))$.

The signature of a message $M$ is computed by considering the integer value of the blocks $M_i$.
The signature is composed of $m/t + 1$ values $s_0, s_1, \ldots, s_{m/t}$ where, for $i \ge 1$, $s_i = F^{2^t - 1 - M_i}(a_i)$,
while $s_0 = F^{\sum_i M_i}(a_0)$. In this case the signature length is $|D|(m/t + 1)$ bits, but verification
time goes up with the parameter $t$. Indeed, the number of hash computations goes as
$O(2^t m/t)$. Also this scheme is based on the non-quasi-invertibility of $F$.

In  general  one  has  to  look  at  the  specific  application  and  decide  among  the  tradeoffs 
specified  by  the  parameter  t  in  order  to  decide  if  it  is  better  to  reduce  the  signature  length 
or  the  computation  time.  See  also  Section  7.3  for  other  ways  to  deal  with  this  issue. 

7.2  Non-Repudiation 

In  case  of  a  legal  dispute  over  the  content  of  a  signed  stream  the  receiver  must  bring  to  court 
some  evidence.  If  the  receiver  saves  the  whole  stream,  then  there  is  no  problem.  However, 
in  some  cases,  for  example  because  of  memory  limitations,  the  receiver  might  be  forced  to 
discard  the  stream  data  after  having  consumed  it.  In  these  cases  what  should  she  save  to 
protect  herself  in  case  of  a  legal  dispute? 

In  the  off-line  solution,  assuming  the  last  block  of  the  signed  stream  always  has  a  special 
reserved  value  for  the  hash-chaining  field,  (say  all  0’s)  she  needs  to  save  only  the  first  signed 
block.  Indeed,  this  proves  that  she  received  something  from  the  sender.  Now  we  could 
conceivably  move  the  burden  of  proof  to  the  sender  to  reconstruct  the  whole  stream  that 
matches  that  first  block  and  ends  with  the  last  block  which  has  the  reserved  value  for  the 
hash-chaining  field  (a  similar  idea  was  used  in  [1]  to  deal  with  storage  limitations  on  low-end 
devices). 

Similarly  in  the  on-line  solution,  at  a  minimum  the  receiver  needs  to  save  the  first  signed 
block  and  all  1-time  signatures  and  have  the  sender  reconstruct  the  stream.  However,  in 
practice,  this  may  still  be  too  much  to  save.  For  practical  applications,  we  suggest  the 
following  scheme.  The  on-line  stream  could  be  broken  up  into  a  sequence  of  chunks,  each 
chunk  representing  a  logical  unit,  e.g.,  a  TV  programme  or  a  live  broadcast  of  a  game  or 
even  programming  sent  over  some  fixed  sized  time  interval.  The  idea  is  that  once  a  logical 
unit  is  decided,  an  upper  bound  on  the  total  number  of  blocks  in  it  can  be  estimated  and 
all  the  one-time  keys  needed  for  a  chunk  of  that  size  could  be  precomputed  by  the  sender. 
In  addition,  by  using  the  off-line  stream  signing  technique,  the  sender  can  compute  a  single 
hash  value  which  when  signed  can  be  used  to  authenticate  each  of  these  one-time  public 
keys  if  they  were  to  be  sent  as  a  stream  of  keys,  one  key  in  each  stream  block.  The  new 
on-line scheme works as follows: Initially the sender sends a regular digital signature on
the hash value which can be used to authenticate a stream of one-time public keys. The
stream of one-time keys is then embedded in the actual on-line data stream and the data
stream itself is authenticated by one-time signatures based on the one-time keys, where each
one-time signature is on the running hash of all the data sent so far on the stream. This
way the receiver only needs to store the initial regular digital signature and the last
one-time signature that was received and verified. For non-repudiation purposes, based on the
initial  regular  digital  signature,  the  sender  can  be  forced  to  disclose  the  entire  stream  of 
one-time  public  keys,  and,  based  on  the  last  valid  one-time  signature  stored  by  the  receiver, 
the  sender  can  be  forced  to  produce  a  data  stream  with  the  same  hash  as  what  is  signed  by 
the  last  one-time  signature  stored  by  the  receiver. 

7.3  Hybrid  Schemes 

In  the  on-line  scheme,  the  length  of  the  embedded  authentication  information  is  of  concern 
as  it  could  cut  into  the  throughput  of  the  stream.  In  order  to  reduce  it,  hybrid  schemes 
can  be  considered.  In  this  case  we  assume  that  some  asynchrony  between  the  sender  and 
receiver  is  acceptable. 

Suppose  the  sender  can  process  a  group  (say  20)  of  stream  blocks  at  a  time  before 
sending  them.  With  a  pipelined  process  this  would  only  add  an  initial  delay  before  the 
stream  gets  transmitted.  The  sender  will  sign  with  a  one-time  key  only  1  block  out  of  20. 
The  20  blocks  in  between  these  two  signed  blocks  will  be  authenticated  using  the  off-line 
scheme.  This  way  the  long  1-time  signatures  and  the  verification  time  can  be  amortized 
over  the  20  blocks. 

A  useful  feature  of  our  proposed  1-time  signature  scheme  is  that  it  allows  the  verification 
of  (the  hash  of)  a  message  bit  by  bit.  This  allows  us  to  actually  “spread  out”  the  signature 
bits  and  the  verification  time  among  the  20  blocks.  Indeed,  if  we  assume  that  the  receiver 
is  allowed  to  play  at  most  20  blocks  of  unauthenticated  information  before  stopping  if 
tampering  is  detected  we  can  do  the  following.  We  can  distribute  the  signature  bits  among 
the  20  blocks  and  verify  the  hash  of  the  first  block  bit  by  bit  as  the  signature  bits  arrive. 
This  maintains  the  stream  rate  stable  since  we  do  not  have  long  signatures  sent  in  a  single 
block  and  verification  now  takes  3-4  hash  computations  per  block,  on  every  block. 
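
As a rough check of these figures, the following arithmetic spreads one signature of the scheme of Section 7.1 evenly over a group of 20 blocks, using the signature lengths and average numbers of evaluations of $F$ quoted there. The even split is an assumption made for illustration.

```latex
\text{SHA-1: } \frac{1344 \text{ bytes}}{20} \approx 67 \text{ bytes per block},
\qquad \frac{84}{20} = 4.2 \text{ evaluations of } F \text{ per block};
\\[4pt]
\text{MD5: } \frac{1080 \text{ bytes}}{20} = 54 \text{ bytes per block},
\qquad \frac{68}{20} = 3.4 \text{ evaluations of } F \text{ per block}.
```

Both cases are consistent with the estimate of 3-4 hash computations per block.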

It is also possible to remove the constraint on playing 20 blocks of unauthenticated
information before tampering is detected. This requires a simple modification to our
on-line scheme. Instead of embedding in block $B_i$ its own 1-time signature, we embed the
signature of the next block $B_{i+1}$. This means that in the on-line case blocks have to be
processed two at a time now. When this modification is applied to the hybrid scheme, the
signature  bits  in  the  current  20  blocks  are  used  to  authenticate  the  following  20  blocks  so 
unauthenticated  information  is  never  played.  However,  this  means  that  now  the  sender  has 
to  process  40  blocks  at  a  time  in  the  hybrid  scheme. 

7.4  Probabilistic  One-Time  Signatures. 

The  length  and  the  computation  time  of  a  1-time  signature  are  determined  by  the  length 
of  the  message  being  signed.  Typically  the  message  is  first  hashed,  thus  the  range  of  the 
collision-resistant  hash  function  is  the  crucial  parameter  here.  However,  in  order  to  make 
sure  that  the  function  is  a  strong  collision-resistant  one,  it  is  necessary  to  assume  a  long 
range. MD5 with its 128-bit range is considered borderline. SHA-1 seems to give more security
with a 160-bit range.

Is it possible to use a weak collision-resistant hash function to hash messages instead?
In the presence of a chosen message attack this is not possible, as an attacker could find
two messages $x$ and $y$ that hash to the same value and, by obtaining a signature on $x$, would
then obtain a signature on $y$. It is however possible to foil this particular attack by first
randomizing the message being signed.

Consider the following approach. Let $(G, S, V)$ be a secure signature scheme (against
adaptively-chosen message attack) on a message of size $b$, but we want to sign messages
of arbitrary length. Let $H_k$ be a family of weak collision-resistant hash functions whose
range is $(b - |k|)$-bit strings. Consider the following new signature scheme on messages of
arbitrary size. On input a message $M$, choose a random $k$, compute $h = H_k(M)$ and output
$k$ together with a signature on the pair $k, h$. We claim that this new signature scheme is
also secure against adaptively-chosen message attack. This is because the receiver cannot
use the fact that $H_k$ is a weak collision hash function, because she does not know which
function is going to be used.
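
The randomized hashing step can be sketched as follows. The sketch assumes keyed BLAKE2b truncated to 80 bits as a stand-in for the family $H_k$ and a 60-bit key $k$ stored in 8 bytes; the function names are hypothetical, and the underlying signature scheme is left out, since only the value handed to it changes.

```python
import hashlib
import secrets

KEY_BITS = 60      # |k|, an illustrative key length
DIGEST_BYTES = 10  # 80-bit range for the weak collision-resistant family


def keyed_hash(k: bytes, message: bytes) -> bytes:
    # Stand-in for H_k (assumption: BLAKE2b with its built-in key parameter).
    return hashlib.blake2b(message, key=k, digest_size=DIGEST_BYTES).digest()


def randomize(message: bytes):
    """Signer side: pick a fresh k and return the pair (k, h) that gets signed."""
    k = secrets.randbits(KEY_BITS).to_bytes(8, "big")
    return k, keyed_hash(k, message)


def check(message: bytes, k: bytes, h: bytes) -> bool:
    """Verifier side: recompute h from the transmitted k before checking the signature."""
    return secrets.compare_digest(keyed_hash(k, message), h)
```

Because $k$ is chosen only at signing time, collisions prepared in advance for one member of the family are of no use against the member that is actually used.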

The only issue here is non-repudiation, as the signer can find collisions and, when he is
challenged with a signed message $M$, he can present another message $M'$ that has the same
signature. However, this is not really a problem as one can hold the signer responsible for
any signed message, as only he can find collisions. A more detailed discussion about these
issues can be found in [3].

Notice that the family $H_k$ can be easily built out of regular hash functions via "keying"
techniques. For example in iterative constructions of $H$, $k$ could be used as the IV. In
practice we suggest using the first 80 bits of SHA-1 keyed via the IV with $k$.

The  above  technique  does  not  have  a  particular  impact  with  typical  signature  schemes 
as  reducing  the  range  of  the  hash  function  is  not  an  issue  there.  But  with  1-time  signatures 
this  allows  for  some  savings  in  the  length  of  the  signature  and  the  computation  time.  For 
example with typical lengths $|k|$ would be, say, 60 bits and a weak collision hash function
would be around 80 bits in range (to obtain a level of security comparable to the 160-bit
strong collision-resistant functions). So the probabilistic hashing would return a value of
140 bits which implies a savings of almost 15% in length and computation time. We stress
that this is a general result for all 1-time signature schemes.

When applied to our stream signature scheme, that improvement is already valuable in itself.
But in our specific case we can improve even more, as we can use the hybrid approach once
again. Indeed, we don’t have to use a different $k$ for each block. The signer could keep $k$
fixed for a limited amount of time. This time window should be small enough to prevent
an attacker from finding collisions for the hash function $H_k$, which is kept fixed during this
time window. If $H_k$ is a good hash function it should take roughly $2^{40}$ steps to find such
collisions. Thus a small window of a few seconds should not pose security problems. If the
rate of the stream is high enough, we will sign several blocks during this time window. Say
for simplicity that we sign 20 blocks with the same $k$. This allows us to spread the bits
of $k$ over 20 blocks, which means that the probabilistic hashing now returns an 83-bit value
per block, thus achieving almost a 50% efficiency improvement in signature length and
computation time!
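
The arithmetic behind these figures can be checked in a few lines. The sketch below is only
illustrative: the function names are ours, and the 60-bit key, 80-bit hash range and 160-bit
baseline are the example lengths quoted in the text, not fixed parameters of the scheme.

```python
from math import ceil


def hashed_bits_per_block(key_bits, hash_bits, blocks_per_key=1):
    """Bits signed per block: the hash output plus this block's share of k."""
    return hash_bits + ceil(key_bits / blocks_per_key)


def savings(baseline_bits, new_bits):
    """Relative reduction with respect to the baseline length."""
    return 1 - new_bits / baseline_bits


# A fresh 60-bit k for every block, 80-bit keyed hash.
fresh_k = hashed_bits_per_block(60, 80)
# The same k shared by 20 consecutive blocks.
shared_k = hashed_bits_per_block(60, 80, blocks_per_key=20)

print(fresh_k, savings(160, fresh_k))
print(shared_k, savings(160, shared_k))
```

With these inputs the per-block value is 140 bits for a fresh $k$ and 83 bits when $k$ is shared
by 20 blocks, which is where the reduction relative to a 160-bit hash comes from.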

Remark: In order to use this solution and be able to preserve non-repudiation the recipient
must save the whole stream, i.e., the techniques described in Section 7.2 cannot be applied.
Indeed, those techniques rely on the fact that the signer cannot find a collision and thus
repudiate the message he sent to the receiver (in case the latter did not save the original
message, but just its hash).


8  Applications 

MPEG  VIDEO  AND  AUDIO.  In  the  case  of  MPEG  video  and  audio,  there  are  several 
methods  for  embedding  authentication  data.  Firstly,  the  Video  Elementary  stream  has  a 
USER-DATA  section  where  arbitrary  user  defined  information  can  be  placed.  Secondly,  the 
MPEG  system  layer  allows  for  an  elementary  data  stream  to  be  multiplexed  synchronously 
with  the  packetized  audio  and  video  streams.  One  such  elementary  stream  could  carry  the 
authentication  information.  Thirdly,  techniques  borrowed  from  digital  watermarking  can 
be  used  to  embed  information  in  the  audio  and  video  itself  at  the  cost  of  slight  quality 
degradation. In the case of MPEG video, since each frame is fairly large (hundreds of
kilobits) and the receiver is required to have a buffer of at least 1.8 Mbits, both the off-line
as well as the on-line solutions can be deployed without compromising picture quality. In the
case of audio, however, in the extreme case the bit rate could be very low (e.g., 32 Kbits/s),
each audio frame could be small (approx. 1000 bytes), and the receiver’s audio buffer
may be tiny (less than 2 Kbytes). In such extreme cases the on-line method, which requires
around  1000  bytes  of  authentication  information  per  block  cannot  be  used  without  seriously 
cutting  into  audio  quality.  For  these  extreme  cases,  the  best  on-line  strategy  would  be  to 
send  the  authentication  information  via  a  separate  but  multiplexed  MPEG  data  stream. 
For  regular  MPEG  audio,  if  the  receiver  has  a  reasonably  sized  buffer  (say  32K)  then  we 
can  apply  the  on-line  scheme  with  a  large  block  size  (say  20  K)  to  obtain  a  signed  MPEG 
audio  stream  with  a  delay  of  approximately  5-6  seconds  and  space  overhead  of  5%.  If  the 
receiver  buffer  is  small  but  not  tiny  (say  3  K)  a  hybrid  scheme  would  work:  for  example  one 
could  use  groups  of  33  hash-chained  blocks  of  length  1000  bytes  each;  this  would  typically 
result  in  a  5%  degradation  and  a  delay  in  the  20  second  range. 

JAVA. In the original version of Java (JDK 1.0), for an applet coming from the network, first
the startup class was loaded and then additional classes were (down)loaded by the class
loader in a lazy fashion, as and when the running applet first attempted to access them.
Since our ideas apply not only to streams, which are a linear sequence of blocks, but in general
to trees as well (where one block can invoke any of its children), one way to sign Java applets
based on our model would be to sign the startup class, and each downloaded class
would have embedded in it the hashes of the additional classes that it downloads directly.
However, for code signing, JavaSoft has adopted the multiple signature and hash table based
approach in JDK 1.1, where each applet is composed of one or several Java archives, each
of which contains a signed table of hashes (the manifest) of its components. It is our belief
that once Java applets become really large and complex the shortcomings of this approach
will become apparent: (1) the large size of the hash table in relation to the classes actually
invoked during a run, since this table has to be fully extracted and authenticated before any
class gets authenticated; (2) the computational cost of signing each of the manifests if an
applet is composed of several archives; (3) the difficulty of accommodating classes or data
resources which are generated on the fly by the application server based on a client request.
These could be addressed by using some of our techniques. Also, the problem of how to
sign audio/video streams will have to be considered in the future evolution of Java, since
putting the hash of a large audio/video file in the manifest would not be acceptable.

Broadcast  Applications.  Our  schemes  (both  the  off-line  and  the  on-line  one)  can  be 
easily  modified  to  fit  in  a  broadcast  scenario.  Assume  that  the  stream  is  being  sent  to  a 
broadcast  channel  with  multiple  receivers  who  dynamically  join  or  leave  the  channel.  In 
this  case  a  receiver  who  joins  when  the  transmission  is  already  started  will  not  be  able 
to  authenticate  the  stream  since  she  missed  the  first  block  that  contained  the  signature. 
Both schemes, however, can be modified so that every once in a while, apart from the regular
chaining information, there will also be a regular digital signature on a block embedded in
the stream. Receivers who are already verifying the stream via the chaining mechanism can
ignore this signature, whereas receivers who tune in at various times will rely on the first such
signature they encounter to start their authentication chain.

A  different  method  to  authenticate  broadcasted  streams,  with  weaker  non-repudiation 
properties  than  ours,  is  proposed  in  [5]. 

Long Files when Communication is Costly. Our solution can also be used to
authenticate long files in a way that reduces communication cost in case of tampering. Suppose
that a receiver is downloading a long file from the Web. There is no “stream requirement”
to consume the file as it is downloaded, so the receiver could easily receive the whole file
and then check a signature at the end. However, if the file has been tampered with, the
user will be able to detect this fact only at the end. Since communication comes at a cost (time
spent online, bandwidth wasted, etc.), this is not a satisfactory solution. Using our solution
the receiver can interrupt the transmission as soon as tampering is detected, thus saving
precious communication resources.
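
The early-abort behaviour can be sketched as follows. This is a minimal illustration under
stated assumptions, not the exact construction: we assume the simple chained layout in which
every block carries the hash of the next block, and that the hash of the first block has
already been authenticated by a regular signature (the signature check itself is omitted).
SHA-256 and the names `chain_blocks` and `receive` are our own illustrative choices.

```python
import hashlib

HASH_LEN = 32  # SHA-256 digest size; the hash function is an assumption


def chain_blocks(payloads):
    """Sender side: append to each block the hash of the following chained block.

    Returns the chained blocks and the hash of the first one, which the
    sender would sign with a regular signature scheme.
    """
    chained = []
    next_hash = b"\x00" * HASH_LEN  # filler carried by the last block
    for payload in reversed(payloads):
        block = payload + next_hash
        chained.append(block)
        next_hash = hashlib.sha256(block).digest()
    chained.reverse()
    return chained, next_hash


def receive(blocks, trusted_first_hash):
    """Receiver side: yield payloads one by one, stop at the first bad block."""
    expected = trusted_first_hash
    for index, block in enumerate(blocks):
        if hashlib.sha256(block).digest() != expected:
            raise ValueError(f"tampering detected at block {index}")
        expected = block[-HASH_LEN:]   # hash of the next block
        yield block[:-HASH_LEN]        # authenticated payload
```

Because `receive` is a generator, a download loop can consume authenticated payloads as they
arrive and close the connection as soon as the exception is raised, instead of waiting for
the end of the file.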

Signature-based  content  filtering  at  proxies.  Recently  there  has  been  interest  in 
using  digital  signatures  as  a  possible  way  to  filter  content  admitted  in  by  proxy  servers 
through  firewalls.  Essentially  when  there  is  a  firewall  and  one  wishes  to  connect  to  an 
external  server,  then  this  connection  can  only  be  done  via  a  proxy  server.  In  essence  one 
establishes  a  connection  to  a  proxy  and  the  proxy  establishes  a  separate  connection  to  the 
external  server  (if  that  is  permitted).  The  proxy  then  simulates  a  connection  between  the 
internal  machine  and  the  external  machine  by  copying  data  between  the  two  connections. 
There  has  been  some  interest  in  modifying  proxies  so  that  they  would  only  allow  signed 
data  to  flow  from  the  external  server  to  the  internal  machine.  However,  since  the  proxy  is 
only  copying  data  as  it  arrives  from  the  external  connection  into  the  internal  connection 
and  it  cannot  store  all  the  incoming  data  before  transferring  it,  the  proxy  cannot  use  a 
regular  signature  scheme  for  solving  this  problem.  However,  it  is  easy  to  see  that  in  the 
proxy’s  view  the  data  is  a  stream.  Hence  if  there  could  be  some  standardized  way  to  embed 
authentication  data  in  such  streams,  then  techniques  from  this  paper  would  prove  useful  in 
solving this problem.

Abstract

Multicast stream authentication and signing is an important and challenging problem.
Applications include the continuous authentication of radio and TV Internet broadcasts,
and authenticated data distribution by satellite. The main challenges are fourfold. First,
authenticity must be guaranteed even when only the sender of the data is trusted.
Second, the scheme needs to scale to potentially millions of receivers. Third, streamed
media distribution can have high packet loss. Finally, the system needs to be efficient to
support fast packet rates.

We propose two efficient schemes, TESLA and EMSS, for secure lossy multicast streams.
TESLA, short for Timed Efficient Stream Loss-tolerant Authentication, offers sender
authentication, strong loss robustness, high scalability, and minimal overhead, at the cost
of loose initial time synchronization and slightly delayed authentication. EMSS, short
for Efficient Multi-chained Stream Signature, provides non-repudiation of origin, high loss
resistance, and low overhead, at the cost of slightly delayed verification.

1 Introduction

As the online population continues to expand, the Internet is increasingly used to
distribute streamed media, such as streamed radio and video. We expect this trend to
continue.

To enable a widespread and trusted streamed media dissemination, one must first provide
sufficient security guarantees. A most prominent security risk from a user point
of view is data authenticity. The user needs assurance that the data stream originated
from the purported sender. Otherwise, a malicious ISP could replace parts of the stream
with its own material. For example, an adversary might alter stock quotes that are
distributed through IP multicast. In that scenario, the receiver needs strong sender and
data authentication.

The problem of continuous stream authentication is solved for the case of one sender and
one receiver via standard mechanisms, e.g. [12, 18]. The sender and receiver
agree on a secret key which is used in conjunction with a message authenticating code
(MAC) to ensure authenticity of each packet. In case of multiple receivers, however, the
problem becomes much harder to solve, because a symmetric approach would allow anyone
holding a key (that is, any receiver) to forge packets. Alternatively, the sender can use
digital signatures to sign every packet with its private key. This solution provides
adequate authentication, but digital signatures are prohibitively inefficient.

Real-time data streams are lossy, which makes the security problem even harder. With many
receivers, we typically have a high variance among the bandwidth of the receivers,
with high packet loss for the receivers with relatively low bandwidth. Nevertheless, we
want to assure data authenticity even in the presence of this high packet loss.

A number of schemes for solving this problem (i.e. authenticating the data and sender in
a setting where only the sender is trusted) have been suggested in the past few years
[7, 13, 28, 31], but none of these schemes is completely satisfactory. We discuss these
schemes in section 4.

This paper presents two very different solutions to the problem of authenticating data
streams efficiently in a lossy environment. The first solution, called TESLA (for
Timed Efficient Stream Loss-tolerant Authentication), uses only symmetric cryptographic
primitives such as pseudorandom functions (PRFs) and message authentication codes
(MACs), and is based on timed release of keys by the sender. More specifically, the scheme
is based on the following idea: The sender commits to a random key $k$ without
revealing it and transmits the commitment to the receivers. The sender then
attaches a message authenticating code to the next packet $P_i$
and uses the key $k$ as the MAC key. In a later packet $P_{i+1}$,
the sender decommits to $k$, which allows the receivers to
verify the commitment and the MAC of packet $P_i$. If both
verifications are correct, and if it is guaranteed that packet
$P_{i+1}$ was not sent before packet $P_i$ was received, then a receiver
knows that the packet $P_i$ is authentic. To start this
scheme, the sender uses a regular signature scheme to sign
the initial commitment. All subsequent packets are authenticated through chaining.

Our  first  scheme,  TESLA,  has  the  following  properties: 

• Low computation overhead. The authentication typically involves only one MAC
function and one hash function computation per packet, for both sender and
receiver.

•  Low  per-packet  communication  overhead.  Overhead 
can  be  as  low  as  10  bytes  per  packet. 

•  Arbitrary  packet  loss  tolerated.  Every  packet  which  is 
received  in  time  can  be  authenticated. 

• Unidirectional data flow. Data only flows from the
sender to the receiver. No acknowledgments or other
messages are necessary after connection setup. This
implies that the sender’s stream authentication overhead
is independent of the number of receivers, so our
scheme is very scalable.

•  No  sender-side  buffering.  Every  packet  is  sent  as  soon 
as  it  is  ready. 

•  High  guarantee  of  authenticity.  The  system  provides 
strong  authenticity.  By  strong  authenticity  we  mean 
that  the  receiver  has  a  high  assurance  of  authenticity, 
as  long  as  our  timing  and  cryptographic  assumptions 
are  enforced.1 

•  Freshness  of  data.  Every  receiver  knows  an  upper 
bound  on  the  propagation  time  of  the  packet. 

1  However,  the  scheme  does  not  provide  non-repudiation.  That  is,  the 
recipient  cannot  convince  a  third  party  that  the  stream  arrived  from  the 
claimed  source. 


The second scheme, called EMSS (for Efficient Multi-chained Stream Signature), is based
on signing a small number of special packets in a data stream; each packet is linked
to a signed packet via multiple hash chains. This is achieved
by appending the hash of each packet (including possible
appended hashes of previous packets) to a number of subsequent packets. Appropriate
choice of parameters to the scheme guarantees that almost all arriving packets can be
authenticated, even over highly lossy channels. The main
features of this scheme are:

• It amortizes the cost of a signature operation over multiple packets, typically about
one signature operation per 100 to 1000 packets.

• It tolerates high packet loss.

• It has low communication overhead, between 20 and 50
bytes per packet, depending on the requirements.

• It provides non-repudiability of the sender to the transmitted data.

2 TESLA: Timed Efficient Stream Loss-tolerant Authentication

In this section, we describe five schemes for stream authentication. Each scheme builds
on the previous one and improves it to solve its shortcomings. Finally, scheme
V, which we call TESLA (short for Timed Efficient Stream
Loss-tolerant Authentication), satisfies all the properties we
listed in the introduction. The cryptographic primitives used
in this section are reviewed in Appendix A, which also contains a sketch of a security
analysis for our scheme.

We use the following notation: $\langle x, y \rangle$ denotes the concatenation of $x$ and $y$,
$S$ stands for sender, and $R$ stands for receiver. A stream $\mathcal{S}$ is divided into
chunks $M_i$ (which we also call messages), $\mathcal{S} = \langle M_1, M_2, \ldots, M_l \rangle$.
Each message $M_i$ is sent in a packet $P_i$, along with additional authentication
information.

2.1  Threat  Model  and  security  guarantee 

We design our schemes to be secure against a powerful
adversary with the following capabilities:

• Full control over the network. The adversary can
eavesdrop, capture, drop, resend, delay, and alter packets.

• The adversary has access to a fast network with negligible delay.

• The adversary’s computational resources may be very
large, but not unbounded. In particular, this means that
the adversary can perform efficient computations, such
as computing a reasonable number of pseudo-random
function applications and MACs with negligible delay.
Nonetheless, the adversary cannot invert a pseudorandom function (or distinguish
it from a random function) with non-negligible probability.

The security property we guarantee is that the receiver
does not accept as authentic any message $M_i$ unless $M_i$ was
actually sent by the sender. A scheme that provides this
guarantee is called a secure stream authentication scheme.

Note  that  the  above  security  requirements  do  not  include 
protection  against  message  duplication.  Such  protection  can 
(and  should)  be  added  separately  by  standard  mechanisms, 
such  as  nonces  or  serial  numbers.  Schemes  I-III  below  do 
have  protection  against  message  duplication.  Note  also  that 
we  do  not  address  denial-of-service  attacks. 

2.2 Initial synchronization (preliminary discussion)

All five schemes below begin with an initial synchronization protocol where each receiver
compares its local time with that of the sender, and registers the difference. We
remark that a rough upper bound on the clock difference is
sufficient. In fact, all that the receiver needs is a value $\delta$
such that the sender’s clock is no more than $\delta$ time-units
ahead of the receiver’s clock, where $\delta$ can be on the order
of multiple seconds.2 In section 2.8 we describe a simple
protocol and discuss scalability issues related to the initial
synchronization.

A  basic  assumption  that  underlies  the  security  of  our 
scheme  is  that  the  local  internal  clocks  of  the  sender  and 
recipient  do  not  drift  too  much  during  a  session. 

2.3 Scheme I: The Basic Scheme

Here is a summary of scheme I: The sender issues a
signed commitment to a key which is only known to itself.
The sender then uses that key to compute a MAC
on a packet $P_i$, and later discloses the key in packet $P_{i+1}$,
which enables the receiver to verify the commitment and
the MAC of packet $P_i$. If both verifications are successful,
packet $P_i$ is authenticated and trusted. The commitment
is realized via a pseudorandom function with collision resistance.
More details on the requirements on the pseudorandom functions are in appendix A.
This protocol is similar to the Guy Fawkes protocol [1].

We now describe the basic scheme in more detail.
The scheme is depicted in Figure 1. We assume
that the receiver has an authenticated packet
$P_{i-1} = \langle D_{i-1}, \mathrm{MAC}(K'_{i-1}, D_{i-1}) \rangle$ to start with (where
$D_{i-1} = \langle M_{i-1}, F(K_i), K_{i-2} \rangle$). The fields have the following
meanings. $M_{i-1}$ is the message contained by the packet,
$K'_i = F'(K_i)$ is the secret key used to compute the MAC
of the next packet, and $F(K_i)$ commits to the key $K_i$ without revealing it.
The functions $F$ and $F'$ are two different
pseudo-random functions. Commitment value $F(K_i)$ is important
for the authentication of the subsequent packet $P_i$.
To bootstrap this scheme, the first packet needs to be authenticated
with a regular digital signature scheme, for example RSA [27].

2 Many clock synchronization algorithms exist, for example the work of
Mills on NTP [22], and its security analysis [5].

Figure 1. Basic stream authentication scheme. $M_i$ stands for message $i$, $P_i$
is packet $i$, $K_i$ denotes the secret key $i$, $F, F'$ are pseudo-random functions,
and $\mathrm{MAC}(K'_i, D_i)$ computes the MAC of packet $i$ using the secret key
$K'_i = F'(K_i)$.

To send the message $M_i$, the sender picks a fresh random key $K_{i+1}$ and constructs
the following packet $P_i = \langle D_i, \mathrm{MAC}(K'_i, D_i) \rangle$, where
$D_i = \langle M_i, F(K_{i+1}), K_{i-1} \rangle$
and $\mathrm{MAC}(K'_i, D_i)$ computes a message authenticating
code of $D_i$ under key $K'_i$.

When the receiver receives packet $P_i$, it cannot verify the
MAC instantly, since it does not know $K_i$ and cannot reconstruct $K'_i$.
Packet $P_{i+1} = \langle D_{i+1}, \mathrm{MAC}(K'_{i+1}, D_{i+1}) \rangle$
(where $D_{i+1} = \langle M_{i+1}, F(K_{i+2}), K_i \rangle$) discloses $K_i$ and
allows the receiver first to verify that $K_i$ is correct ($F(K_i)$
equals the commitment which was sent in packet $P_{i-1}$); and
second to compute $K'_i = F'(K_i)$ and check the authenticity
of packet $P_i$ by verifying its MAC.

After the receiver has authenticated $P_i$, the commitment
$F(K_{i+1})$ is also authenticated and the receiver repeats this
scheme to authenticate $P_{i+1}$ after $P_{i+2}$ is received.
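
The packet construction and the delayed verification described above can be sketched as
follows. This is only an illustration under our own assumptions: HMAC-SHA256 with two
different labels stands in for the pseudo-random functions $F$ and $F'$ and for the MAC, the
byte encoding of $D_i$ is ad hoc, and the names `Packet`, `make_packet` and `verify_previous`
are hypothetical. The timing check (the security condition discussed next) is deliberately
left out.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


def prf(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA256, used here as a stand-in PRF and MAC (an assumption)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def F(key: bytes) -> bytes:
    return prf(key, b"commit")      # commitment function F


def F_prime(key: bytes) -> bytes:
    return prf(key, b"mac-key")     # MAC-key derivation F'


@dataclass
class Packet:
    message: bytes        # M_i
    commitment: bytes     # F(K_{i+1})
    disclosed_key: bytes  # K_{i-1}
    mac: bytes            # MAC(K'_i, D_i)


def encode(message: bytes, commitment: bytes, disclosed_key: bytes) -> bytes:
    """D_i = <M_i, F(K_{i+1}), K_{i-1}>, with a length prefix on the message."""
    return len(message).to_bytes(4, "big") + message + commitment + disclosed_key


def make_packet(message, key_i, key_next, key_prev) -> Packet:
    commitment = F(key_next)
    d_i = encode(message, commitment, key_prev)
    return Packet(message, commitment, key_prev, prf(F_prime(key_i), d_i))


def verify_previous(prev: Packet, cur: Packet, trusted_commitment: bytes) -> bool:
    """Authenticate P_i (prev) once P_{i+1} (cur) has disclosed K_i.

    trusted_commitment is F(K_i), taken from the already authenticated P_{i-1}.
    """
    key_i = cur.disclosed_key
    if not hmac.compare_digest(F(key_i), trusted_commitment):
        return False
    d_i = encode(prev.message, prev.commitment, prev.disclosed_key)
    return hmac.compare_digest(prf(F_prime(key_i), d_i), prev.mac)
```

Once `verify_previous` succeeds for $P_i$, the field `prev.commitment` (that is, $F(K_{i+1})$)
becomes the trusted commitment for the next round.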

This scheme can be subverted if an attacker gets packet
$P_{i+1}$ before the receiver gets $P_i$, since the attacker would
then know the secret key $K_i$ which is used to compute the
MAC of $P_i$, which allows it to change the message and the
commitment in $P_i$ and forge all subsequent traffic. To prevent this attack,
the receiver checks the following security
condition on each packet it receives, and drops the packet if
the condition does not hold.

Security condition: A data packet $P_i$ arrived safely,
if the receiver can unambiguously decide, based on its
synchronized time and $\delta_t$, that the sender did not yet
send out the corresponding key disclosure packet $P_j$.

This  stream  authentication  scheme  is  secure  as  long  as  the 
security  condition  holds.  We  would  like  to  emphasize  that 
the  security  of  this  scheme  does  not  rely  on  any  assumptions 
on  network  latency. 

In order for the receiver to verify the security condition,
the receiver needs to know the precise sending schedule of
packets. The easiest way to solve this problem is by using a
constant packet rate. The sending time of packet $P_i$ is hence
$T_i = T_0 + i/r$, where $T_i$ is the time on the sender’s clock
and $r$ is the packet rate (number of packets per second). In
that case, the security condition which the receiver checks
has the following form: $ArrT_i + \delta_t < T_{i+1}$, where $ArrT_i$
stands for the arrival time (on the synchronized receiver’s
clock) of packet $P_i$. The main problem with this scheme is
that, in order to satisfy the security condition, the sending
rate must be low enough that the time between consecutive packets
exceeds the network delay from the sender
to the receiver. This is a severe limitation on the throughput
of the transmission. In addition, the basic scheme cannot
tolerate packet loss. In particular, once a packet is dropped
no further packets can be authenticated. We now gradually
extend the basic scheme to eliminate these deficiencies.

2.4  Scheme  II:  Tolerating  Packet  Loss 

To authenticate lossy multimedia streams, tolerating
packet loss is paramount. Our solution is to generate a sequence
of keys $\{K_i\}$ through a sequence of
pseudo-random function applications. We denote $v$ consecutive
applications of the pseudo-random function $F$ as
$F^v(x) = F^{v-1}(F(x))$. By convention, $F^0(x) = x$. The
sender picks a random $K_n$ and pre-computes a sequence of
$n$ key values, where $K_0 = F^n(K_n)$, and $K_i = F^{n-i}(K_n)$.
We call this sequence of values the key chain. Each $K_i$
looks pseudorandom to an attacker; in particular, given $K_i$,
the attacker cannot invert $F$ and compute any $K_j$ for $j > i$.
On the other hand, the receiver can compute all $K_j$ from a
$K_i$ it received, where $j < i$, since $K_j = F^{i-j}(K_i)$. Hence,
if a receiver received packet $P_i$, any subsequently received
packet will allow it to compute $K_i$ and $K'_i = F'(K_i)$ and
verify the authenticity of $P_i$. This scheme tolerates an arbitrary
number of packet losses.
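
A minimal sketch of such a key chain is given below. HMAC-SHA256 with a fixed label is our
stand-in for the pseudo-random function $F$, and the helper names are hypothetical; only the
relations $K_i = F^{n-i}(K_n)$ and $K_j = F^{i-j}(K_i)$ are taken from the text.

```python
import hashlib
import hmac
import os


def F(key: bytes) -> bytes:
    """Stand-in for the pseudo-random function F (an assumption)."""
    return hmac.new(key, b"chain", hashlib.sha256).digest()


def F_iter(key: bytes, v: int) -> bytes:
    """F^v(key), with F^0(key) = key."""
    for _ in range(v):
        key = F(key)
    return key


def make_key_chain(n: int, seed: bytes = None):
    """Return [K_0, ..., K_n] with K_i = F^(n-i)(K_n); K_n is random unless a seed is given."""
    chain = [seed if seed is not None else os.urandom(32)]
    for _ in range(n):
        chain.append(F(chain[-1]))
    chain.reverse()
    return chain


def recover_key(k_i: bytes, i: int, j: int) -> bytes:
    """Compute K_j = F^(i-j)(K_i) for j <= i."""
    if j > i:
        raise ValueError("later keys cannot be derived from earlier ones")
    return F_iter(k_i, i - j)


def key_is_on_chain(k_i: bytes, i: int, k_j: bytes, j: int) -> bool:
    """Check a disclosed K_i against an already trusted earlier key K_j."""
    return hmac.compare_digest(recover_key(k_i, i, j), k_j)
```

A receiver that missed the packets disclosing $K_4$, $K_5$ and $K_6$ can still call
`recover_key(k_7, 7, 4)` after seeing $K_7$, which is the loss tolerance claimed above.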

Similarly,  dropping  unsafe  packets  (i.e.  those  packets 
where  the  security  condition  does  not  hold)  does  not  cause 
any  problems  in  the  authentication  of  later  packets. 

In the basic scheme I, an adversary might try to capture
two consecutive packets before the recipient received the
first of them, and then forge the packet stream. Although the
security condition prevents this, the key chain also prevents
this attack, because the initial commitment commits to the
entire key chain and it is computationally infeasible for the
attacker to invert or find collisions in the pseudo-random
function.3

An additional benefit is that the key commitment does
not need to be embedded in each packet any more. Due to
the intractability of inverting the pseudo-random function,
any value of the chain is a commitment for the entire chain.
Hence the commitment in the initial authenticated packet is
sufficient. Figure 2 shows an example of scheme II.

Figure 2. Scheme II. The packet format
is the same as in scheme I, except that the
commitment $F(K_{i+1})$ is omitted and the
keys form a one-way key chain.


2.5  Scheme  III:  Achieving  Fast  Transfer  Rates 

As we mentioned earlier, the receiver needs to be assured
that it receives the packet $P_i$ before the corresponding key
disclosure packet $P_{i+1}$ is sent by the sender. This condition
severely limits the transmission rate of the previous two
schemes since $P_{i+1}$ can only be sent after every receiver has
received $P_i$.

We solve this problem by disclosing the key $K_i$ of the
data packet $P_i$ in a later packet $P_{i+d}$, instead of in the following
packet, where $d$ is a delay parameter that is set by
the sender and announced at session set-up.

3 I.e., it is infeasible, given $K_i = F(K_{i+1})$, to find $K'_{i+1}$ such
that $F(K'_{i+1}) = K_i$. Even if the attacker could find such a collision,
it would be able to forge only a single message
$M_{i+1}$. Forging additional messages would require inverting $F$, i.e., finding
$K'_{i+2}$ such that $F(K'_{i+2}) = K'_{i+1}$.

The sender determines the delay $d$ based on the packet
rate $r$, the maximum tolerable synchronization uncertainty
$\delta_{tMax}$, and the maximum tolerable network delay $d_{NMax}$.
Setting $d = \lceil (\delta_{tMax} + d_{NMax}) r \rceil$ allows the receiver to
successfully verify the security condition even in the case of
maximum allowable network delay and maximal synchronization error.
The choice of $\delta_{tMax}$ and $d_{NMax}$ presents the
following tradeoff: Large delay values will cause a large
$d$, which results in long delays until packet authentication.
On the other hand, short maximum delays cause
the security condition to drop packets at receivers with a
slow network connection. However, multimedia data packets
become obsolete if they are received after their segment
of the stream was already played or presented to the user. In
that case, dropping unsafe packets might not interfere with
the multimedia stream since the packets are likely to be obsolete.
We stress that the choice of $d$ does not affect the
security of the scheme, only its usability.

For the case of a constant packet rate, the security condition
is easy to state. We assume that the sending time of
the first packet is $T_0$ and the sending time of packet $P_i$ is
$T_i = T_0 + i/r$. To verify the security condition for an incoming
packet, the receiver checks that $ArrT_i + \delta_t < T_{i+d}$,
where $ArrT_i$ is the arrival time of packet $P_i$ at the receiver.
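
As a worked illustration of the delay parameter and of this check, consider the following
sketch. The function names and the numeric values in the usage lines are our own
illustrative assumptions; only the two formulas are taken from the text.

```python
from math import ceil


def disclosure_delay(rate, delta_t_max, d_n_max):
    """d = ceil((delta_tMax + d_NMax) * r), measured in packets."""
    return ceil((delta_t_max + d_n_max) * rate)


def sending_time(t0, i, rate):
    """T_i = T_0 + i / r on the sender's clock."""
    return t0 + i / rate


def arrived_safely(arrival_time, delta_t, t0, i, d, rate):
    """Security condition for packet P_i: ArrT_i + delta_t < T_{i+d}."""
    return arrival_time + delta_t < sending_time(t0, i + d, rate)


# Hypothetical session: 100 packets/s, 0.5 s clock uncertainty, 0.25 s network delay.
d = disclosure_delay(rate=100, delta_t_max=0.5, d_n_max=0.25)

# Packet 10, sent at T_10 = 0.1 s, arriving at receiver time 0.3 s.
safe_with_lag = arrived_safely(0.3, 0.5, t0=0.0, i=10, d=d, rate=100)
safe_without_lag = arrived_safely(0.3, 0.5, t0=0.0, i=10, d=1, rate=100)
```

In this hypothetical session the packet passes the check with the computed lag but fails it
with $d = 1$, which is the throughput limitation of the first two schemes.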

2.6 Scheme IV: Dealing with Dynamic Packet Rates

Our previous schemes used a fixed or predictable sender
schedule, with each recipient knowing the exact sending
time of each packet. Since this severely restricts the flexibility
of senders, we design a scheme which allows senders to
send at dynamic transmission rates, without the requirement
that every receiver needs to know about the exact sending
schedule of each packet. The solution to this problem is to
pick the MAC key and the disclosed key in each packet only
on a time interval basis instead of on a packet index basis.
The sender uses the same key $K_i$ to compute the MAC for
all packets which are sent in the same interval $i$. All packets
sent in interval $i$ disclose the key $K_{i-d}$.

At session set-up the sender announces values $T_0$ and
$T_\Delta$, where the former is the starting time of the first interval
and the latter is the duration of each interval. In addition
the delay parameter $d$ is announced. These announcements
are signed by the sender. The interval index at any time
$t$ is determined as $i = \lfloor (t - T_0)/T_\Delta \rfloor$. A key $K_i$ is
associated with each interval $i$. The keys are chained in the
same way as in Scheme II. The sender uses the same key
$K'_i = F'(K_i)$ to compute the MAC for each packet which
is sent in interval $i$. Every packet also carries the interval
index $i$ and discloses the key of a previous interval $K_{i-d}$.
We refer to $d$ as disclosure lag. The format of packet $P_j$
is $P_j = \langle M_j, i, K_{i-d}, \mathrm{MAC}(K'_i, M_j) \rangle$. Figure 3 shows an
example of this scheme, where $d = 4$.

In this scheme, the receiver verifies the security condition
as follows. Each receiver knows the values of $T_0$,
$T_\Delta$, and $\delta_t$. ($\delta_t$ is the value obtained from the initial
synchronization protocol.) Assume that the receiver gets
packet $P_j$ at its local time $t_j$, and the packet was apparently
sent in interval $i$. The sender can be at most in interval
$i' = \lfloor (t_j + \delta_t - T_0)/T_\Delta \rfloor$. The security condition in this case is
simply $i + d > i'$, which ensures that no packet which discloses
the value of the key could have been sent yet. Figure 4
illustrates the verification of the security condition.
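
A minimal sketch of the interval computation and of the security condition $i + d > i'$ follows. The function names and the use of floating-point seconds are our assumptions for illustration; a real implementation would take the parameters from the signed session announcement.

```python
import math


def interval_index(t: float, t0: float, t_delta: float) -> int:
    """Interval index floor((t - T_0) / T_delta) for time t."""
    return math.floor((t - t0) / t_delta)


def is_safe(i: int, d: int, t_recv: float, delta_t: float,
            t0: float, t_delta: float) -> bool:
    """Security condition i + d > i'.

    i' is the latest interval the sender can be in when the packet
    arrives at local time t_recv, given the synchronization error bound
    delta_t. The key K_i is first disclosed in interval i + d, so the
    packet is safe only if the sender cannot have reached that interval.
    """
    latest = interval_index(t_recv + delta_t, t0, t_delta)
    return i + d > latest
```

For example, with $T_0 = 100$, $T_\Delta = 0.5$ and $d = 4$, a packet of interval 10 that arrives at local time 106.1 with $\delta_t = 0.3$ gives $i' = 12$ and is safe, whereas the same packet arriving at 107.0 gives $i' = 14$ and must be dropped.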

It remains to describe how the values $T_\Delta$ and $d$ are
picked. (We stress that the choice of these values does not
affect the security of the scheme, only its usability.) Before
the sender can pick values for $T_\Delta$ and $d$, it needs to determine
the maximum tolerable synchronization uncertainty
$\delta_{tMax}$, and the maximum tolerable network delay $d_{NMax}$. The
sender defines $\Delta_{Max} = \delta_{tMax} + d_{NMax}$.

The sender's choices for $T_\Delta$ and $\Delta_{Max}$ both present a
tradeoff. First, a large value for $\Delta_{Max}$ will allow slow receivers
to verify the security condition correctly, but requires
a long delay for packet authentication. Conversely,
a short $\Delta_{Max}$ will cause slow receivers to drop packets because
the security condition is not satisfied. The second
tradeoff is that a long interval duration $T_\Delta$ saves on the computation
and storage overhead of the key chain, but a short
$T_\Delta$ more closely achieves the desired $\Delta_{Max}$.

After determining $\delta_{tMax}$, $d_{NMax}$, and $T_\Delta$, the disclosure
lag is $d = \lceil (\delta_{tMax} + d_{NMax}) / T_\Delta \rceil$.
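
As a worked example with purely illustrative numbers (they are not taken from our experiments), suppose $\delta_{tMax} = 0.3$ s and $d_{NMax} = 0.5$ s, so that $\Delta_{Max} = 0.8$ s, and assume the disclosure lag is chosen as the smallest integer $d$ with $d \cdot T_\Delta \geq \Delta_{Max}$:

```latex
\begin{align*}
T_\Delta = 0.1\,\mathrm{s}: \quad & d = \lceil 0.8 / 0.1 \rceil = 8, & d \cdot T_\Delta &= 0.8\,\mathrm{s},\\
T_\Delta = 0.5\,\mathrm{s}: \quad & d = \lceil 0.8 / 0.5 \rceil = 2, & d \cdot T_\Delta &= 1.0\,\mathrm{s}.
\end{align*}
```

The longer interval needs five times fewer keys per unit of time but overshoots the desired $\Delta_{Max}$ by 0.2 s, which is the second tradeoff described above.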

This  scheme  provides  numerous  advantages.  First,  the 
sender can predict how long a pre-computed key chain lasts,
since the number of necessary keys depends only on elapsed time
and not on the number of packets sent. Second, the receiver
can conveniently verify the security condition and
the  sender  does  not  need  to  send  its  packets  at  specific  in¬ 
tervals  (we  will  discuss  the  details  of  this  in  Section  2.9). 
Another  advantage  is  that  new  receivers  can  easily  join  the 
group  at  any  moment.  A  new  group  member  only  needs  to 
synchronize  its  time  with  the  sender  and  receive  the  interval 
parameters  and  a  commitment  to  the  key  chain. 

2.7  Scheme  V:  Accommodate  a  Broad  Spectrum 
of  Receivers 

For  the  previous  schemes,  we  showed  that  there  was  a 
tradeoff  in  the  choice  of  the  key  disclosure  period.  If  the 
time  difference  is  short,  the  packet  can  be  authenticated 
quickly,  but  if  the  packet  travel  time  is  long  the  security 
condition  will  not  hold  for  remote  receivers,  which  forces 
them  to  drop  the  packet.  Conversely,  a  long  time  period 
will  suit  remote  receivers,  but  the  authentication  time  delay 
may  be  unacceptable  for  receivers  with  fast  network  access. 
Since  the  scheme  needs  to  scale  to  a  large  number  of  re¬ 
ceivers  and  we  expect  the  receivers  to  have  a  wide  variety 
of  network  access,  we  need  to  solve  this  tradeoff.  Our  ap¬ 
proach  is  to  use  multiple  authentication  chains  (where  each 
chain  is  as  in  scheme  IV)  with  different  disclosure  periods 

Figure 3. Scheme IV. The MAC key and disclosed key are only dependent on the time interval.
The authentication key of $P_j$ is $K_i$, which is disclosed by packets sent during interval
$i + 4$. In this case, packet $P_{j+4}$ discloses key $K_{i+1}$, which allows the receiver to compute $K_i$
and to authenticate packet $P_j$. We would like to point out that packets $P_{j+2}$ and $P_{j+3}$ are both
authenticated with the same MAC key $K'_{i+3}$, because they were sent in the same time interval.



Figure 4. The security condition visualized. The packet $P_j$ is sent in the interval where key
$K_{i+1}$ is active. The receiver receives the packet when the sender is in interval $i + 3$, but
due to the $\delta_t$ the sender might already be in interval $i + 4$, which discloses key $K_i$. This is
not a problem for the current packet, since key $K_{i+1}$ was not disclosed yet; hence the security
condition is satisfied and the packet is safe.


simultaneously.  Each  receiver  can  then  use  the  chain  with 
the  minimal  disclosure  delay,  sufficient  to  prevent  spurious 
drops  which  are  caused  if  the  security  condition  does  not 
hold. 

The receiver verifies one security condition for each
authentication chain $C_i$ and drops the packet if none of the
conditions are satisfied. Assume that the sender uses $n$
authentication chains, where the first chain has the smallest
delay until the disclosure packet is sent, and the $n$th chain
has the longest delay. Furthermore, assume that for the
incoming packet $P_j$, the security conditions for chains $C_v$
($v < m$) are not satisfied, and the condition for chain $C_m$ is
satisfied. In this case, as long as the key disclosure packets
for the chains $C_v$ ($v < m$) arrive, the receiver's confidence
in the authenticity of packet $P_j$ is increasing. As soon as
the key disclosure packet for a chain $C_v$ ($v \geq m$) arrives,
the receiver is assured of the authenticity of the packet $P_j$.
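
The per-chain check can be sketched as follows. We assume, for illustration only, that each chain is described by a pair (interval duration, disclosure lag), that all chains share the same start time $T_0$, and that the packet carries one interval index per chain; the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Chain = Tuple[float, int]  # (interval duration T_delta, disclosure lag d)


def satisfied_chains(indices: Sequence[int], chains: Sequence[Chain],
                     t_recv: float, delta_t: float, t0: float) -> List[int]:
    """Return the positions of all chains whose security condition holds.

    indices[v] is the interval index the packet carries for chain v.
    An empty result means the packet must be dropped; otherwise the
    first entry is the chain C_m with the smallest usable delay.
    """
    safe = []
    for v, ((t_delta, lag), i) in enumerate(zip(chains, indices)):
        latest = math.floor((t_recv + delta_t - t0) / t_delta)
        if i + lag > latest:
            safe.append(v)
    return safe
```

With three chains of delay 1 s, 5 s and 20 s, a packet sent at sender time 50 and received when the sender may already be at time 53 fails the condition of the first chain but satisfies the other two, so the receiver waits for the key of the second chain.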


2.8  Initial  Synchronization  -  Further  Discussion 

Our  stream  authentication  scheme  relies  on  a  loose  time 
synchronization  between  the  sender  and  all  the  recipients. 
We  call  this  synchronization  loose,  because  the  synchro¬ 
nization  error  can  be  large.  The  only  requirement  we  have 
is that the client knows an upper bound $\delta_t$ on the maximum
synchronization  error. 

Any  time  synchronization  protocol  can  be  used  for  our 
scheme,  as  long  as  it  is  robust  against  an  active  adversary. 

As a proof-of-concept, we present a simple time synchronization
protocol which satisfies the requirements. The
basic protocol is as follows:

$R \rightarrow S$ : Nonce

$S \rightarrow R$ : $\{$Sender time $t_S$, Nonce, Interval Rate,
Interval Id, Interval start time,
Interval key, Disclosure Lag$\}_{K_S^{-1}}$

The receiver$^4$ uses a nonce in its first packet to prevent
an attack which replays a previously signed synchronization
reply. Besides the current time $t_S$ at the sender, the sender
also sends all information necessary to define the intervals
and a commitment to the active key chain. The disclosure
lag defines the number of intervals after which each key
value is disclosed. Finally, the packet is signed with a regular
signature scheme.

For the purposes of our stream authentication scheme,
the receiver is only interested in the maximum possible time
value at the sender. This simplifies the computation. Figure 5
shows a timing diagram of the synchronization. The
receiver sets $\Delta_t = t_S - t_R$ and computes the latest possible
sender's time $t'_S$ as follows: $t'_S = t'_R + \Delta_t$, where $t'_R$ is the
current receiver's time, and $t'_S$ is the estimated sender time.
In the ideal case, the receiver's initial packet arrives at the
sender without delay, denoted as time $t_1$ in the figure. The
maximum time discrepancy is $\delta_t = RTT$ (round-trip time).
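
The receiver-side arithmetic can be sketched as below. We assume here that $t_R$ is the receiver's local time when it sent the nonce, and that the signature and the nonce in the reply have already been verified; the function names are hypothetical.

```python
from typing import Tuple


def synchronize(t_r_request: float, t_s_reply: float,
                t_r_reply: float) -> Tuple[float, float]:
    """Return (offset, delta_t) from one request/reply exchange.

    t_r_request: receiver clock when the nonce was sent (t_R)
    t_s_reply:   sender time carried in the signed reply (t_S)
    t_r_reply:   receiver clock when the reply arrived
    """
    offset = t_s_reply - t_r_request    # Delta_t = t_S - t_R
    delta_t = t_r_reply - t_r_request   # upper bound on the error: the RTT
    return offset, delta_t


def latest_sender_time(t_r_now: float, offset: float) -> float:
    """Upper bound t'_S = t'_R + Delta_t on the sender's current time."""
    return t_r_now + offset
```

Because the sender stamps the reply after the request was sent, the sender's clock cannot have read more than $t_S$ at receiver time $t_R$, so adding the elapsed receiver time never underestimates the sender's clock.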



Figure  5.  The  receiver  synchronizes  its 
time  with  the  sender. 

Scalability is a major concern for a widely deployed system.
If every receiver needs to synchronize its time with
the sender, the sender could be a bottleneck. A better solution
would use distributed and secure time servers. Initially,
the sender synchronizes its time with the time server and
computes the maximum synchronization error $\delta_t(S)$. The
sender would periodically broadcast the interval information,
along with its $\delta_t(S)$ and the current timestamp, digitally
signed to ensure authenticity. The receivers can independently
synchronize their time to the synchronization
server, and individually compute their maximum synchronization
error $\delta_t$. Finally, the receivers add up all the $\delta_t$
values to verify the security condition. Taking this scheme
one step further, we could have a hierarchy of synchronization
servers (only the maximum errors need to propagate).

$^4$ The terms sender and receiver appear reversed in the description of
this  time  synchronization  protocol,  because  we  keep  their  role  with  respect 
to  the  stream  authentication  scheme.  So  it  is  the  receiver  that  synchronizes 
its  time  with  the  sender’s. 


We  could  also  imagine  synchronizing  all  the  synchroniza¬ 
tion  servers  with  a  satellite  signal,  for  example  GPS. 


Combining  with  multicast  group  control  centers.  The 
general  IP  multicast  model  assumes  that  any  host  can  join 
the  multicast  group,  receive  all  group  data,  and  send  data 
to  the  group  [11].  To  join  the  multicast  group,  the  receiver 
only  needs  to  announce  its  interest  to  a  local  router  which 
takes  care  of  forwarding  packets  to  that  receiver.  Each  join¬ 
ing  group  member  contacts  a  central  server  or  a  group  con¬ 
troller  to  negotiate  access  rights  and  session  keys.  This 
model  is  supported  by  the  Secure  Multicast  Users  Group 
(SMUG)  [29]  and  we  adopt  it  for  our  secure  authentication 
scheme,  which  requires  that  each  receiver  performs  an  ini¬ 
tial  registration  (for  time  synchronization  and  interval  tim¬ 
ing  information)  at  the  sender  or  at  a  central  server. 

Here is a sketch of a scalable synchronization mechanism
that uses this infrastructure: Both senders and
receivers synchronize with time synchronization servers
which are dispersed in the network. After the synchronization,
every entity $E$ knows the time and the maximum error
$\delta_t(E)$. The sender $S$ periodically broadcasts a signed
message which contains $\delta_t(S)$, along with the interval and
key chain commitment information for each authentication
chain. A new receiver $R$ therefore only needs to wait for the
broadcast packet, which allows it to compute the synchronization
error between itself and the sender as $\delta_t = \delta_t(S) + \delta_t(R)$.
Based on $\delta_t$, the receiver determines the minimum-delay
authentication chain it can use. Hence, the receiver does
not need to send any messages to the sender, provided that
the sender and receiver have a method to synchronize and
the receiver knows the upper bound of the synchronization
error $\delta_t$.


Dealing  with  clock  drift.  Our  authentication  protocols 
assume  that  there  is  no  clock  drift  between  the  sender  and 
the  receiver.  In  practice,  however,  the  software  clock  can 
drift  (e.g.  under  heavy  load  when  the  timer  interrupt  does 
not  get  serviced).  Also,  an  attacker  might  be  able  to  change 
the  victim’s  time  (e.g.  by  sending  it  spoofed  NTP  mes¬ 
sages).  A  solution  to  these  problems  is  that  the  receiver  al¬ 
ways  consults  its  internal  hardware  clock,  which  has  a  small 
drift  and  which  is  hard  for  an  attacker  to  disturb.  Further¬ 
more,  the  longer  authentication  chains  in  Scheme  V  toler¬ 
ate  an  authentication  delay  on  the  order  of  tens  of  seconds, 
giving  us  a  large  security  margin.  It  is  reasonable  to  assume 
that  the  hardware  clock  does  not  drift  tens  of  seconds  within 
one  session.  Finally,  the  receiver  can  re-synchronize  peri¬ 
odically,  if  the  hardware  clock  appears  to  drift  substantially. 

2.9  Implementation  Issues

We implemented a TESLA prototype in Java. We use
the MD5 hash function [26] in conjunction with the HMAC
construction [4] for our pseudo-random function and the
MAC. To limit the communication overhead, we only use
the 80 most significant bits of the output, which saves space
over the standard 96 bits and gives sufficient security. The
initial synchronization packet is signed using a 1024-bit
RSA signature [27].
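
For illustration, the truncated MAC can be reproduced with Python's standard `hmac` and `hashlib` modules. This is a sketch, not our Java prototype; the function names `mac80` and `mac80_matches` are ours.

```python
import hashlib
import hmac

TAG_LEN = 10  # 80 bits


def mac80(key: bytes, message: bytes) -> bytes:
    """HMAC-MD5 of the message, truncated to the 80 most significant bits."""
    return hmac.new(key, message, hashlib.md5).digest()[:TAG_LEN]


def mac80_matches(key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time comparison of a received tag with the expected one."""
    return hmac.compare_digest(mac80(key, message), tag)
```

Truncation keeps the leading bytes of the 16-byte HMAC-MD5 output, so each MAC and each disclosed key occupies 10 bytes in a packet.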

In  our  design,  all  of  the  functionality  for  TESLA  re¬ 
mains  in  the  application  layer.  This  design  principle  follows 
the approach of ALF, which Tennenhouse and Clark introduce
[9]. In ALF, the application knows best how to handle
the  data,  as  opposed  to  placing  services  in  the  network  or 
transport  layer  of  the  OSI  stack.  ALF  is  ideally  suited  for 
TESLA.  Since  the  authentication  of  packets  is  delayed,  the 
application  knows  best  how  to  handle  unauthenticated  in¬ 
formation,  which  might  be  declared  invalid  later.  We  see 
two  main  possibilities  for  the  application  to  interact  with  a 
TESLA  module  on  the  receiver  side.  First,  we  could  buffer 
all  incoming  packets  and  deliver  them  only  after  their  au¬ 
thenticity  is  assured.  Second,  we  could  deliver  incoming 
packets directly, but inform the application through an upcall
as soon as a packet is authenticated or if the packet is
faulty.  We  implemented  the  second  alternative. 

On the other hand, there are also arguments
for implementing TESLA in the transport layer, along with
other security services [18]. Both variants of interaction
with the application are possible. In the first case, the network
layer buffers the stream data, and forwards it as soon
as the data authenticity is guaranteed.$^5$ In the second case,
the  network  layer  would  directly  forward  the  data  to  the  ap¬ 
plication,  but  this  would  require  another  mechanism  for  the 
network  layer  to  inform  the  application  about  the  validity  of 
the  data.  To  prevent  applications  from  using  data  that  was 
not  authentic,  we  can  imagine  a  scheme  where  the  sender 
encrypts  the  data  in  each  packet  with  a  separate  key  and  re¬ 
leases  the  key  in  a  later  packet.  In  this  case,  the  application 
would  receive  the  encrypted  data,  but  could  only  use  it  after 
it  receives  the  decryption  key. 

We  use  UDP  datagrams  for  all  communication  to  simu¬ 
late  multicast  datagrams.  We  would  like  to  point  out  that 
using  a  reliable  transport  protocol  such  as  TCP  does  not 
make  sense  in  this  setting,  because  TCP  interferes  with  the 
timing  of  packet  arrival  and  does  not  announce  incoming 
packets  to  the  application  if  the  previous  packets  did  not  ar¬ 
rive.  This  is  a  problem  since  our  TESLA  module  resides  in 
the application space. Furthermore, since TESLA is particularly
well suited for lossy data streams, UDP makes perfect
sense, whereas TCP is used in settings which require
reliable communication.

$^5$ The argumentation against this method claims that it would put too
much burden on the network layer to buffer data packets. For the case
of IP fragmentation, however, the network layer already buffers data and
forwards it to the application only when the entire packet is complete.

To  simplify  the  exposition  of  the  protocols,  we  consider 
the  case  of  Scheme  IV,  which  uses  one  authentication  chain 
only, as an exemplar.

Sender  Tasks 

The  sender  first  needs  to  define  the  following  parameters  for 
TESLA: 

•  The  number  of  authentication  chains 

•  The  interval  rate  for  each  authentication  chain 

•  The  disclosure  delay  for  each  authentication  chain 

The  number  of  authentication  chains  is  dependent  on  the 
heterogeneity  of  network  delay  across  receivers,  the  delay 
variance,  and  the  desired  authentication  delay.  For  exam¬ 
ple,  if  we  use  TESLA  in  a  LAN  setting  with  a  small  network 
delay  and  low  delay  variance,  the  sender  can  use  one  single 
authentication  chain  with  a  disclosure  lag  of  about  one  RTT, 
which  can  be  as  low  as  a  few  milliseconds.  The  other  ex¬ 
treme  is  a  radio  broadcast  over  the  Internet  with  millions  of 
receivers.  Some  receivers  will  have  high-speed  network  ac¬ 
cess  with  a  low  delay,  others  use  dialup  modem  lines,  and 
yet  others  might  be  connected  through  a  wireless  link  with 
considerable  delay,  which  can  be  on  the  order  of  seconds. 
To  accommodate  the  latter  category,  which  might  also  have 
a  large  synchronization  error  on  the  order  of  seconds,  the 
longest authentication chain needs to have a disclosure delay
as long as 15 to 30 seconds. Such a long delay is not
acceptable  to  the  high-speed  users.  A  second  authentication 
chain  with  a  small  disclosure  delay  around  1-2  seconds  is 
appropriate.  To  close  the  wide  gap  between  the  high-end 
and  the  low-end  users,  a  third  chain  with  a  delay  of  5  to  10 
seconds  will  appeal  to  the  modem  users. 

Initially, the sender picks a random key $K_n$ and computes
and stores the entire chain of keys $K_i = F(K_{i+1})$.
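
The chain computation can be sketched as follows. We assume here, for illustration, that F applies the truncated HMAC-MD5 with the chain key as the HMAC key to a fixed one-byte input; the constant and the function names are our choices and not part of the scheme definition.

```python
import hashlib
import hmac
import os
from typing import List, Optional

KEY_LEN = 10  # 80-bit keys, matching the truncated HMAC-MD5 output


def F(key: bytes) -> bytes:
    """One step of the key chain: K_i = F(K_{i+1}) (input constant assumed)."""
    return hmac.new(key, b"\x00", hashlib.md5).digest()[:KEY_LEN]


def make_key_chain(n: int, last_key: Optional[bytes] = None) -> List[bytes]:
    """Return [K_0, ..., K_n], computed backwards from a random K_n.

    K_0 serves as the commitment that is sent in the signed
    synchronization packet; keys are later used in increasing index order.
    """
    chain = [b""] * (n + 1)
    chain[n] = last_key if last_key is not None else os.urandom(KEY_LEN)
    for i in range(n - 1, -1, -1):
        chain[i] = F(chain[i + 1])
    return chain
```

Since F is one-way, a receiver holding $K_0$ can check any later key by applying F repeatedly, but cannot compute keys of future intervals.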

Receiver  Tasks 

The receiver initially synchronizes with the sender and determines
the accuracy $\delta_t$. The sender also sends all interval
information and the disclosure lag to the receiver, which
are necessary to verify the security condition. The authenticated
synchronization packet also contains a disclosed key
value, which is a commitment to the key value chain.

For  each  incoming  packet,  the  receiver  first  verifies  the 
security  condition.  It  then  checks  whether  the  disclosed 
key  value  is  correct,  which  can  be  verified  by  applying  the 
HMAC-MD5  (our  pseudo-random  function)  until  it  can  ver¬ 
ify  equality  with  a  previously  authenticated  commitment. 

Block size          16        64       256      1024

MD5             256410    169491     [...]     22075
HMAC-MD5         75187   6535[?]     [...]     17605

Table 1. Performance of primitives of the Cryptix native Java library. The performance
is displayed in the number of operations per second.


Packet size (bytes)               64       256      1024

One authentication chain       27677     23009      8148
Two authentication chains      19394     14566      7402
Three authentication chains    14827     13232      6561
Four authentication chains     12653     11349      5914

Table 2. Performance of our packet authentication scheme for a varying number
of authentication chains. All performance numbers are in packets per second.


To  minimize  the  computation  overhead,  the  receiver  recon¬ 
structs  and  stores  the  chain  of  key  values.  Since  the  MAC 
cannot  be  verified  at  this  time,  the  receiver  adds  the  triplet 
(Packet  Hash,  Interval,  MAC  value)  to  the  list  of  packets  to 
be  verified,  sorted  by  interval  value.  Instead  of  storing  the 
entire  packet,  the  receiver  computes  and  stores  only  the  hash 
value  of  the  packet.  If  the  incoming  disclosed  MAC  key  was 
new,  the  receiver  updates  the  key  chain  and  checks  whether 
it can verify the MAC of any packets on the packet list. In
the  case  a  MAC  does  not  verify  correctly,  the  library  throws 
an  exception  to  warn  the  application.  Finally,  the  packet  is 
delivered  to  the  application. 
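
The receiver steps described above can be sketched as follows. Several details are our assumptions for illustration and are not fixed by the text: the MAC is computed over an 80-bit packet hash (so that storing the hash suffices), F and F' use fixed one-byte inputs, the security condition has already been checked by the caller, and verification results are returned as booleans instead of being raised as exceptions.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import Dict, List, Tuple

TAG_LEN = 10  # 80 bits


def prf(key: bytes, data: bytes) -> bytes:
    """HMAC-MD5 truncated to the 80 most significant bits."""
    return hmac.new(key, data, hashlib.md5).digest()[:TAG_LEN]


def F(key: bytes) -> bytes:        # chain step K_{i-1} = F(K_i), constant assumed
    return prf(key, b"\x00")


def F_prime(key: bytes) -> bytes:  # MAC key K'_i = F'(K_i), constant assumed
    return prf(key, b"\x01")


def packet_hash(message: bytes) -> bytes:
    return hashlib.md5(message).digest()[:TAG_LEN]


class Receiver:
    def __init__(self, commit_index: int, commit_key: bytes, lag: int) -> None:
        self.keys: Dict[int, bytes] = {commit_index: commit_key}
        self.latest = commit_index   # highest authenticated key index
        self.lag = lag               # disclosure lag d
        self.pending: Dict[int, List[Tuple[bytes, bytes]]] = defaultdict(list)

    def _accept_key(self, index: int, key: bytes) -> bool:
        """Check a disclosed key against the last authenticated chain key."""
        if index <= self.latest:
            known = self.keys.get(index)
            return known is not None and hmac.compare_digest(known, key)
        derived = {index: key}
        k = key
        for j in range(index - 1, self.latest - 1, -1):
            k = F(k)
            derived[j] = k
        if not hmac.compare_digest(k, self.keys[self.latest]):
            return False
        self.keys.update(derived)    # reconstruct and store the chain
        self.latest = index
        return True

    def on_packet(self, message: bytes, interval: int, disclosed_key: bytes,
                  tag: bytes) -> List[Tuple[bytes, bool]]:
        """Buffer the packet; return (hash, ok) for packets now verifiable.

        The caller must have checked the security condition beforehand.
        """
        if not self._accept_key(interval - self.lag, disclosed_key):
            return []
        self.pending[interval].append((packet_hash(message), tag))
        results = []
        for i in sorted(self.pending):
            if i not in self.keys:
                continue
            mac_key = F_prime(self.keys[i])
            for digest, stored in self.pending.pop(i):
                ok = hmac.compare_digest(prf(mac_key, digest), stored)
                results.append((digest, ok))
        return results
```

Packets whose disclosed key does not chain back to the commitment are discarded without being buffered in this sketch, which is one possible policy.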

A  possible  denial-of-service  attack  is  an  attacker  sending 
a  packet  marked  as  being  from  an  interval  far  in  the  future. 
A  receiver  would  then  spend  much  time  to  update  its  key 
chain.  A  simple  remedy  against  this  attack  would  be  for  the 
receiver  to  reject  packets  if  they  could  not  have  been  sent 
yet  (along  the  lines  of  the  security  condition). 

A  drawback  of  this  stream  authentication  scheme  is  that 
each  receiver  needs  to  store  the  key  chain  and  packet  in¬ 
formation  to  verify  the  packet  authenticity.  While  the  key 
chain is small (since only a few intervals per second are
used  in  practice),  the  amount  of  storage  required  can  be 
large  for  long  authentication  delays  and  fast  sender  rates. 
In  our  implementation,  only  the  80  bit  hash  and  the  interval 
are  stored  per  packet,  which  amounts  to  12  bytes. 

Performance 

For  each  outgoing  packet,  the  sender  only  needs  to  compute 
one  HMAC  function  per  packet  per  authentication  chain, 
since  the  key  chain  can  be  pre-computed.  Table  1  shows 
the performance of the MD5 and HMAC-MD5 functions
provided  by  Cryptix  [10]  running  on  a  550  MHz  Pentium  III 
Linux  PC.  The  Java  code  was  executed  by  the  JIT  compiler 
which  comes  with  the  JDK  1.1.8  provided  by  IBM  [17]. 

We  analyze  the  performance  of  our  stream  authentica¬ 
tion  scheme  by  measuring  the  number  of  packets  per  second 
that  a  sender  can  create.  Table  2  shows  the  packet  rates  for 
different  packet  sizes  and  different  numbers  of  authentica¬ 
tion  chains.  We  suspect  that  an  optimized  C  implementation 
might  be  at  least  twice  as  fast. 


The  communication  overhead  of  our  prototype  is  24 
bytes  per  authentication  chain.  Since  we  use  80  bit  HMAC- 
MD5,  both  the  disclosed  key  and  the  MAC  are  10  bytes 
long.  The  remaining  four  bytes  are  used  to  send  the  interval 
index. 

Also,  the  overhead  of  pre-computing  the  key  chain  is 
minimal. In our experiments we use an interval length
of 1/10th of a second. To pre-compute a key chain long
enough to authenticate packets for one hour (3600 s at 10
intervals per second, i.e. 36000 keys), the sender pre-computation
time is only 36000/74626 ≈ 0.5 seconds.

The  computational  overhead  on  the  receiver  side  is  the 
same  as  on  the  sender  side,  except  that  the  receiver  needs  to 
recompute  the  key  chain  while  the  sender  can  pre-compute 
it.  However,  the  overhead  of  computing  the  key  chain  is 
negligible, since it involves computing one HMAC function
in each time interval, and in practice only tens of intervals
are used per second.

3  EMSS:  Efficient  Multi-chained  Stream  Signature

TESLA does not provide non-repudiation. Most multimedia
applications do not need non-repudiation since they
discard the data after it is decoded and played. Stream signature
schemes are still important, however, for the following
two cases. First, some applications really do need
continuous non-repudiation of each data packet, but we
could not find a compelling example. Second, and more
importantly, in settings where time synchronization is difficult,
TESLA might not work. We present EMSS (Efficient
Multi-chained Stream Signature), a scheme that provides
non-repudiation and also achieves sender authentication.

The  requirements  for  our  stream  signature  scheme  are  as 
follows: 

•  Non-repudiation  for  each  individual  packet 

•  Continuous  non-repudiation  of  packets 

•  Robust  against  high  packet  loss 

•  Low  computation  and  communication  overhead

•  Real-time  stream  content 

•  No  buffering  of  packets  at  the  sender  required 

3.1  Our  Basic  Signature  Scheme 

To  achieve  non-repudiation,  we  rely  on  a  conventional 
signature scheme, for example RSA [27] or Rohatgi's k-times
signature scheme [28]. Unfortunately, the computation
and communication overhead of current signature
schemes  is  too  high  to  sign  every  packet  individually.  To 
reduce  the  overhead,  one  signature  needs  to  be  amortized 
over  multiple  packets. 

Our basic solution is based on the following scheme to
achieve non-repudiation of a sequence of packets. Packet $P_i$
includes a hash of the previous packet $P_{i-1}$. By
sending a signature packet at the end of the stream, which
contains the hash of the final packet along with a signature,
we achieve non-repudiation for all packets. To achieve robustness
against packet loss, each packet contains multiple
hashes of previous packets, and furthermore, the final signature
packet signs the hash of multiple packets. Figure 6
shows an example, where each packet contains the hash of
the two previous packets, and where the signature packet
contains the hash of the last two packets and the signature.




Figure  6.  We  achieve  non-repudiation  through  peri¬ 
odic  signature  packets,  which  contain  the  hash  of  sev¬ 
eral  data  packets,  and  the  inclusion  of  the  hash  of  the 
current  packet  within  future  packets.  The  inclusion 
of  multiple  hashes  achieves  robustness  against  packet 
loss. 
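
The hash chaining of Figure 6 can be sketched as follows. The sketch makes several assumptions that are ours: packets are plain tuples serialized with `repr`, SHA-256 stands in for the hash function, and `sign` and `verify` are caller-supplied functions (for example wrappers around an RSA implementation), since no signature scheme is implemented here.

```python
import hashlib
from typing import Callable, Dict, List, Set, Tuple

Link = Tuple[int, bytes]                  # (packet id, hash of that packet)
Packet = Tuple[int, bytes, Tuple[Link, ...]]
SigPacket = Tuple[Tuple[Link, ...], bytes]


def digest(packet: Packet) -> bytes:
    return hashlib.sha256(repr(packet).encode()).digest()


def build_stream(payloads: List[bytes], sign: Callable[[bytes], bytes],
                 offsets: Tuple[int, ...] = (1, 2),
                 sig_fanout: int = 2) -> Tuple[List[Packet], SigPacket]:
    """Each packet carries the hashes of the packets `offsets` before it."""
    packets: List[Packet] = []
    for i, payload in enumerate(payloads):
        links = tuple((i - k, digest(packets[i - k]))
                      for k in offsets if i - k >= 0)
        packets.append((i, payload, links))
    last = len(packets) - 1
    sig_links = tuple((last - k, digest(packets[last - k]))
                      for k in range(sig_fanout) if last - k >= 0)
    return packets, (sig_links, sign(repr(sig_links).encode()))


def verify_stream(received: Dict[int, Packet], sig_packet: SigPacket,
                  verify: Callable[[bytes, bytes], bool]) -> Set[int]:
    """Return the ids of received packets that chain to the signature."""
    sig_links, signature = sig_packet
    if not verify(repr(sig_links).encode(), signature):
        return set()
    trusted = dict(sig_links)             # id -> authenticated hash
    verified: Set[int] = set()
    for i in sorted(received, reverse=True):
        if i in trusted and digest(received[i]) == trusted[i]:
            verified.add(i)
            for j, h in received[i][2]:   # hashes inside a verified packet
                trusted.setdefault(j, h)  # become trusted themselves
    return verified
```

With two hashes per packet, the loss of any single packet leaves all other packets verifiable, whereas the loss of two consecutive packets cuts off everything sent before them.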

In order for the receiver to continuously verify the signature
of the stream, the sender sends periodic signature packets.
Since the receiver can only verify the signature of a
packet after it receives the next signature packet, it is clear
that the receiver experiences a delay until packet verification.

To simplify the following discussion, we describe this
scheme as a graph problem and use the corresponding terminology.
Namely, we use the term node instead of packet,
and edge instead of hash link. We define the length of an
edge as $L(E_{ij}) = |i - j|$, where $i$ and $j$ are the id's of
the corresponding nodes. If packet $P_j$ contains the hash of
packet $P_i$, we draw a directed edge starting at $P_i$ to $P_j$. We
call $P_j$ a supporting packet of $P_i$. Similarly, an edge points
from a packet $P_k$ to a signature packet $S_l$, if $S_l$ contains the
hash of $P_k$. We assume that some of the packets are dropped
between the sender and the receiver. All nodes which correspond
to dropped packets are removed from the graph. A
packet $P_i$ is verifiable, if there exists a path from $P_i$ to any
signature packet $S_j$.
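
In graph terms, verifiability is a reachability question, which can be sketched as below for edges of fixed lengths. The function name and the representation (sets of packet ids, with the signature packet assumed to be received) are our assumptions.

```python
from typing import Iterable, Set


def verifiable_packets(received: Set[int], offsets: Iterable[int],
                       signed: Set[int]) -> Set[int]:
    """Ids of received packets that have a path to a signature packet.

    received: ids of the data packets that arrived
    offsets:  edge lengths; P_i has an edge to each P_{i+k}
    signed:   ids of packets whose hash is in a received signature packet
    """
    offsets = tuple(offsets)
    verified: Set[int] = set()
    # Edges only point to higher ids, so one pass from the highest id
    # downwards sees every supporting packet before the packets it supports.
    for i in sorted(received, reverse=True):
        if i in signed or any(i + k in verified for k in offsets):
            verified.add(i)
    return verified
```

For instance, with edge lengths 1 and 2 and packets 6 and 7 signed, losing packets 2 and 5 leaves every other packet verifiable, while losing the adjacent packets 2 and 3 makes packets 0 and 1 unverifiable.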

This  stream  signature  scheme  has  the  following  parame¬ 
ters: 

•  Number  of  edges  per  node 

•  Length  and  distribution  of  edges 

•  Frequency  of  signature  nodes 

•  Number  and  distribution  of  incoming  edges  in  signa¬ 
ture  nodes 

These  parameters  influence  the  computation  and  communi¬ 
cation  overhead,  the  delay  until  verification,  and  the  robust¬ 
ness  against  packet  loss.  We  want  to  achieve  low  overhead 
while  retaining  high  robustness  against  packet  loss  and  a 
low  verification  delay. 

To  simplify  the  problem  of  optimizing  all  parame¬ 
ters  simultaneously,  we  first  focus  on  the  interplay  be¬ 
tween  the  number  and  distribution  of  edges  to  achieve 
high  robustness  against  packet  loss.  We  first  consider 
static  edges,  which  means  that  all  the  outgoing  and  in¬ 
coming  edges  of  each  node  have  predefined  lengths.  For 
example, in a "1-3-7" scheme, the node $P_i$ has outgoing
edges to $P_{i+1}, P_{i+3}, P_{i+7}$, and incoming edges from
$P_{i-1}, P_{i-3}, P_{i-7}$.

To  simplify  the  problem  even  further,  we  initially  assume 
independent  packet  loss,  i.e.  each  packet  has  an  equal  loss 
probability.$^6$

Instead of computing the probability precisely for each
node, we wrote a program to perform simulations. We
checked the accuracy of the simulation program on the
cases for which we computed an analytical solution:
1-2-4 and 1-2-3-4. Our simulation (with 2500
samples simulating up to 1000 packets before the signature
packet) had an absolute error of less than ±2% of the verification
probability for these two cases.

$^6$ Our first attempt was to devise an analytical formula to model the
probability for each node that it is connected to a signature node. Unfortunately,
finding an exact formula is harder than it first appears, so deriving
the analytical formula automatically for a given edge distribution
remains an open problem. We illustrate this complexity with an example
for the recurrence relation which describes the simple 1-2-4 scheme:
$P[N-i] = (1-q) \cdot (P[N-i+1] + q P[N-i+2] + (2-q) q^2 P[N-i+4] - (1-q)^2 q^2 P[N-i+5])$,
where $P[i]$ is the probability that node $i$ is connected to node $N$ which is
signed, and $q$ is the probability that the node is dropped.
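
The comparison between the analytical solution and a simulation can be sketched for the 1-2-4 scheme as follows. This is not our original simulation program. The sketch assumes independent loss, a signed node N that is always received, boundary values P[N] = 1 and P[k] = 0 for k > N, and it reads P[i] as the probability that node i is both received and connected to N.

```python
import random
from typing import List, Sequence


def recurrence_1_2_4(q: float, n: int) -> List[float]:
    """p[i] = P[N - i] for i = 0..n, from the recurrence for 1-2-4."""
    p = [0.0] * (n + 1)
    p[0] = 1.0

    def P(i: int) -> float:           # nodes beyond N do not exist
        return p[i] if i >= 0 else 0.0

    for i in range(1, n + 1):
        p[i] = (1 - q) * (P(i - 1) + q * P(i - 2)
                          + (2 - q) * q ** 2 * P(i - 4)
                          - (1 - q) ** 2 * q ** 2 * P(i - 5))
    return p


def simulate(q: float, n: int, offsets: Sequence[int] = (1, 2, 4),
             trials: int = 20000, seed: int = 1) -> List[float]:
    """Monte Carlo estimate of the same probabilities."""
    rng = random.Random(seed)
    hits = [0] * (n + 1)
    for _ in range(trials):
        ok = [False] * (n + 1)        # ok[i]: node N - i received and connected
        ok[0] = True
        for i in range(1, n + 1):
            if rng.random() >= q:     # node N - i was not dropped
                ok[i] = any(i - k >= 0 and ok[i - k] for k in offsets)
        for i, connected in enumerate(ok):
            hits[i] += connected
    return [h / trials for h in hits]
```

Replacing `offsets` in `simulate` explores other static schemes, for which no recurrence is given here.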

We  ran  extensive  simulations  to  find  a  good  distribution 
of  edges  withstanding  high  amounts  of  dropped  nodes.  In 
our  largest  simulation,  we  searched  through  all  combina¬ 
tions  of  six  edges  per  node,  where  the  maximum  length  of 
any  edge  was  51,  and  the  probability  of  dropping  a  node 
was 60%.$^7$ In our simulation, we assumed that the final
seven  nodes  all  existed  and  that  they  all  contained  an  edge 
to  the  signature  node. 

The  simulation  results  were  illuminating.  The  most  im¬ 
portant  finding  from  the  simulation  study  is  that  the  ma¬ 
jority  of  combinations  are  robust.  Figure  8  illustrates  this 
point.  The  x-axis  ranges  over  the  average  probability  of  ver¬ 
ification  p.  The  figure  shows  how  many  combinations  had 
that  average  verification  probability  p,  measured  over  400 
nodes  preceding  the  signature  packet.  The  figure  demon¬ 
strates  that  most  of  the  combinations  have  high  robustness. 
In  fact,  99%  of  all  combinations  give  an  average  verifica¬ 
tion  probability  over  90%.  This  finding  motivates  the  use  of 
random  edges  instead  of  static  edges. 

Another interesting result is that the continuous case
1-2-3-4-5-6 is the weakest combination, and that exponentially
increasing edge lengths 1-2-4-8-16-32 also had poor robustness.
One of the strongest combinations is 5-11-17-24-36-39. We show
the performance of these three combinations in Figure 7. The
continuous case has the lowest verification probability, the
exponential chain is already much better, and the last case
does not seem to weaken as the distance from the signature
packet increases.
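
The comparison of static edge sets can be reproduced in spirit with a small Monte Carlo sketch. This is not the simulation program used for the figures. It assumes independent loss with probability 0.6, treats the seven nodes closest to the signature packet as always present and directly linked to it, and estimates the probability that a node has a path of received nodes to the signature packet. The function names and the choice of 200 trials are illustrative assumptions.

```python
import random


def verification_profile(edges, n_nodes=400, loss=0.6, guaranteed=7,
                         trials=200, seed=1):
    """Estimate P[node at distance j is verifiable] for j = 1..n_nodes.

    Distance j counts packets back from the signature packet. The hash of
    the node at distance j is assumed to be carried by the nodes at
    distances j - d for every edge length d in `edges`.
    """
    rng = random.Random(seed)
    hits = [0] * (n_nodes + 1)
    for _ in range(trials):
        received = [False] * (n_nodes + 1)
        linked = [False] * (n_nodes + 1)  # has a path to the signature
        for j in range(1, n_nodes + 1):
            if j <= guaranteed:
                # The final nodes exist and point at the signature node.
                received[j] = linked[j] = True
            else:
                received[j] = rng.random() >= loss
                linked[j] = any(
                    j - d >= 1 and received[j - d] and linked[j - d]
                    for d in edges
                )
            hits[j] += linked[j]
    return [h / trials for h in hits[1:]]


def average(profile):
    return sum(profile) / len(profile)


continuous = average(verification_profile([1, 2, 3, 4, 5, 6]))
exponential = average(verification_profile([1, 2, 4, 8, 16, 32]))
spread = average(verification_profile([5, 11, 17, 24, 36, 39]))

print(f"1-2-3-4-5-6       {continuous:.3f}")
print(f"1-2-4-8-16-32     {exponential:.3f}")
print(f"5-11-17-24-36-39  {spread:.3f}")
```

With the continuous edge set, a single run of six consecutive lost packets cuts off every node behind it, which is why its average falls well below that of the spread-out combination.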

The assumption of independent packet loss does not hold in the
Internet. Many studies show that packet loss is correlated,
which means that the probability of loss is much higher if the
previous packet was lost. Paxson shows in one of his recent
studies that packet loss is correlated and that the length of
loss bursts exhibits infinite variance [24]. Borella et al.
draw similar conclusions; furthermore, they find that the
average length of loss bursts is about 7 packets [6].

Yajnik et al. show that a k-state Markov model can model
Internet packet loss patterns [32]. For our simulation purposes,
the two-state model is sufficient, since it can model simple
patterns of bursty loss well [16, 32]. The main advantage of
randomizing the edges, however, is visible when we consider
correlated packet loss. Figure 9 shows a simulation with 60%
packet loss and where the average length of a burst loss is 10
packets. We can clearly see in the figure that the verification
probability of the static edge scheme drops exponentially,
whereas the random edges still provide a high verification
probability.

7 We chose six edges per node because we wanted a high average
robustness for the case of 60% packet loss, and five edges did
not give us a high verification probability.

Figure 7. The verification probability for three static cases.
Top line: 5-11-17-24-36-39. Middle line: 1-2-4-8-16-32. Bottom
line: 1-2-3-4-5-6.

Figure 8. Number of combinations of six hashes that resulted in
a given average verification probability (x-axis: average
probability that a packet is verifiable). Note that we assume a
60% packet loss probability.

Figure 9. The verification probability for random vs. a static
case. Top line is random link distribution. Bottom line is
5-11-17-24-36-39.
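
The two-state loss model referred to above can be sketched as follows. This is a minimal illustration, assuming the common "good/bad" (Gilbert-style) parameterization in which every packet sent in the bad state is lost. The helper names are hypothetical. The transition probabilities are derived from the two quantities quoted in the text, the overall loss rate (60%) and the mean burst length (10 packets).

```python
import random


def two_state_params(loss_rate, mean_burst):
    """Return (p_good_to_bad, p_bad_to_good) for a two-state loss model."""
    # A burst ends with probability p_bad_to_good per packet, so its
    # expected length is 1 / p_bad_to_good.
    p_bad_to_good = 1.0 / mean_burst
    # Stationary loss rate = p_gb / (p_gb + p_bg); solve for p_gb.
    p_good_to_bad = loss_rate * p_bad_to_good / (1.0 - loss_rate)
    return p_good_to_bad, p_bad_to_good


def simulate_losses(n, loss_rate, mean_burst, seed=0):
    """Return a list of booleans, True where the packet is lost."""
    p_gb, p_bg = two_state_params(loss_rate, mean_burst)
    rng = random.Random(seed)
    lost = rng.random() < loss_rate  # start in the stationary distribution
    out = []
    for _ in range(n):
        out.append(lost)
        if lost:
            lost = rng.random() >= p_bg  # remain in the bad state
        else:
            lost = rng.random() < p_gb   # enter the bad state
    return out


def mean_burst_length(losses):
    """Average length of maximal runs of consecutive lost packets."""
    bursts, run = [], 0
    for lost in losses:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0


losses = simulate_losses(200_000, loss_rate=0.6, mean_burst=10)
print("empirical loss rate :", sum(losses) / len(losses))
print("empirical mean burst:", mean_burst_length(losses))
```

For a 60% loss rate and a mean burst of 10 packets, the derived transition probabilities are 0.15 (good to bad) and 0.1 (bad to good).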


3.2  The  Extended  Scheme 

The basic scheme has a lot of redundancy. All the supporter
packets carry the same hash value of a given packet. In the
experiments we use six hashes per packet, hence six packets
carry the same hash value. Removing this redundancy might give
us a lower communication overhead and improved robustness
against loss.

The core idea is to split the hash into $k$ chunks, where a
quorum of any $k'$ chunks is sufficient to allow the receiver
to validate the information. One approach is to use Rabin's
Information Dispersal Algorithm [25], which has precisely this
property. Another approach is to produce a hash function with a
large number of independent bits, but only look at a limited
number of those bits. This can most easily be realized by a
family of universal hash functions [8].

The main advantage of this scheme is that only $k'$ out of the
$k$ packets need to arrive, which has a higher robustness in
some circumstances than receiving 1 packet out of $d$ in the
basic scheme. For example, if we use the basic scheme with
80-bit hashes and six hashes per packet, the communication
overhead is at least 60 bytes, and the probability that at
least one out of six packets arrives is $1 - q^6$, where $q$ is
the loss probability. In contrast, if we use the extended scheme
with a hash of 480 bits, chunks of 16 bits, $k = 30$, and
$k' = 5$, the probability that the receiver gets more than four
packets is $1 - \sum_{i=0}^{4} \binom{30}{i} q^{30-i} (1-q)^{i}$.
Clearly, the latter probability is much higher. Although both
probabilities only provide an upper bound on the verification
probability, they still give an intuition on why the extended
scheme provides higher robustness to packet loss.
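
The two arrival probabilities above can be evaluated numerically. The sketch below assumes independent loss with probability q, as the formulas do. The function names are illustrative.

```python
from math import comb


def p_basic(q, copies=6):
    """P[at least one of `copies` supporter packets arrives]."""
    return 1 - q ** copies


def p_extended(q, k=30, k_prime=5):
    """P[at least k_prime of the k chunk-carrying packets arrive]."""
    too_few = sum(
        comb(k, i) * q ** (k - i) * (1 - q) ** i for i in range(k_prime)
    )
    return 1 - too_few


for q in (0.4, 0.6, 0.8, 0.9):
    print(f"q={q:.1f}  basic={p_basic(q):.4f}  extended={p_extended(q):.4f}")
```

At q = 0.6 the extended scheme comes out clearly ahead. At very high loss rates such as q = 0.9 the ordering reverses, which is consistent with the "in some circumstances" qualifier.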

The  simulation  confirmed  these  findings.  The  extended 
scheme  outperforms  the  basic  scheme  in  robustness  against 
packet  loss.  Figure  10  shows  a  comparison  of  the  two 
schemes  with  identical  communication  overhead. 


Figure 10. The verification probability for the basic vs. the
extended scheme (x-axis: distance to final signature packet).
Top line is the extended scheme. Bottom line is the basic
scheme.


3.3  Signature  Packets 

An important requirement of our signature scheme is that the
receiver can continuously verify the signature of packets.
Clearly, the receiver can only verify the signature once it can
trace the authentication links to a signature packet. Hence,
the verification delay depends on the frequency and the
transmission reliability of signature packets. The signature
packet rate depends on the available computation and
communication resources. If we use 1024-bit RSA signatures, a
dedicated server can compute on the order of 100 signatures per
second. The corresponding communication overhead is 128 bytes
for the signature plus 10 bytes for each hash included.

We also performed simulations with signature packets. The
parameters included the signature rate, the loss probability of
signature packets (see footnote 8), and the number of hashes per
signature packet.

8 The loss probability might be different for signature packets
if they are sent redundantly or in a higher service class in the
context of QoS.

Figure 11 shows the sawtooth-shaped verification probability for
a stream with 10% packet loss (bursty loss) in which the average
burst length of dropped packets is 10. The hash is split into 9
chunks of 27 bits each (spanning a maximum length of 100
packets), and 3 chunks are necessary to verify a packet, which
gives us 81 bits of the signature. The communication overhead is
therefore about 35 bytes per packet. The signature packets are
sent every 250 packets and they contain 80-bit hashes of 40
packets and one 1024-bit RSA digital signature, which amounts to
128 bytes. Each signature packet is sent twice, so the loss
probability of a signature packet is reduced to 1%. The average
per-packet overhead in this case is 40 bytes.
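
The step from about 35 bytes to about 40 bytes per packet can be checked with a short calculation. The sketch assumes that a signature packet consists only of the signature and the hashes it carries, and that its full size is counted as overhead and spread over the 250 packets between signatures. The function name is illustrative.

```python
def average_overhead(per_packet_bytes, period, sig_bytes, n_hashes,
                     hash_bytes, copies):
    """Average authentication overhead per data packet, in bytes."""
    sig_packet = sig_bytes + n_hashes * hash_bytes
    return per_packet_bytes + copies * sig_packet / period


overhead = average_overhead(
    per_packet_bytes=35,  # chunk overhead quoted in the text
    period=250,           # one signature packet every 250 packets
    sig_bytes=128,        # 1024-bit RSA signature
    n_hashes=40,
    hash_bytes=10,        # 80-bit hashes
    copies=2,             # each signature packet is sent twice
)
print(f"{overhead:.2f} bytes per packet")
```

Under these assumptions the result is roughly 39 bytes, in line with the stated 40 bytes.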


Figure 11. The verification probability for the extended scheme
including periodic signature packets (x-axis: distance to final
signature packet).


3.4  Case  Study  on  Two  Settings 

We  consider  two  different  cases  of  stream  distribution 
and  we  analyze  the  overhead  of  applying  EMSS  to  ensure 
the  non-repudiation  of  the  streamed  data. 

Case I: Streamed Distribution of Traffic Data

Assume that a municipality has traffic sensors distributed over
streets. It broadcasts this data over the Internet so citizens
(and robot-driven vehicles) can improve their trip planning. The
system requirements are as follows:

•  The  data  rate  of  the  stream  is  about  8  Kbps,  about  20 
packets  of  64  bytes  each  are  sent  every  second. 


•  The  packet  drop  rate  is  at  most  5%,  where  the  average 
length  of  burst  drops  is  5  packets. 

•  The  verification  delay  should  be  less  than  10  seconds. 

Many different instantiations of EMSS result in efficient
schemes which satisfy these requirements. The following scheme
offers low overhead with high verification probability. Each
packet has two hashes, and the length of each hash chain element
is chosen uniformly at random from the interval [1, ..., 50].
Each hash is 80 bits long, hence only one hash is necessary for
verification. A signature packet is sent every 100 packets, or
every five seconds. This rate is not needed for robustness in
this case, but it ensures that the verification delay is less
than ten seconds with high probability. Each signature packet
carries the hashes of five data packets. The simulation predicts
an average verification probability per packet of 98.7%.

The computation overhead is minimal. The sender only needs to
compute one signature every five seconds, and only 20 hash
functions per second. The communication overhead is also low.
Each data packet carries 20 bytes containing the hashes of two
previous packets (see footnote 9). The signature packet contains
five hashes and a signature, and its length is hence 50 bytes
plus the signature length. Assuming a 1024-bit RSA signature,
the signature packet is 178 bytes long. The average per-packet
overhead is therefore about 22 bytes, which is much lower than
previous schemes, which we review in Section 4.
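
The Case I figures follow from simple arithmetic, reproduced below as a sanity check. The sketch assumes that the whole signature packet is counted as overhead and amortized over the 100 data packets between signatures. The variable names are illustrative.

```python
HASH_BYTES = 10           # 80-bit hash
RSA_SIG_BYTES = 128       # 1024-bit RSA signature
PACKETS_PER_SECOND = 20
SIGNATURE_PERIOD = 100    # data packets between signature packets

data_packet_overhead = 2 * HASH_BYTES                # two hashes per packet
signature_packet = 5 * HASH_BYTES + RSA_SIG_BYTES    # five hashes + signature
seconds_between_signatures = SIGNATURE_PERIOD / PACKETS_PER_SECOND
average_overhead = data_packet_overhead + signature_packet / SIGNATURE_PERIOD

print("signature packet size  :", signature_packet, "bytes")
print("signature interval     :", seconds_between_signatures, "s")
print("average overhead/packet:", round(average_overhead, 2), "bytes")
```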

Case  II:  Real-time  Video  Broadcast 

Assume we want to broadcast signed video on the Internet. The
system requirements are as follows:

•  The  data  rate  of  the  stream  is  about  2  Mbps,  about  512 
packets  of  512  bytes  each  are  sent  every  second. 

•  Some  clients  experience  packet  drop  rates  up  to  60%, 
where  the  average  length  of  burst  drops  is  10  packets. 

•  The  verification  delay  should  be  less  than  1  second. 

The high packet drop rate makes it difficult for signature
packets to reach the receiver. To increase the likelihood that
signature packets arrive, we send them twice, separated by a
delay, since packet loss is correlated. If we approximate the
loss probability by assuming that the losses of the two copies
are uncorrelated when they are sent with a delay between them,
the probability that at least one of them arrives is
approximately $1 - 0.6^2 = 0.64$. Since the packet loss is so
high and the verification delay relatively short, we send a
signature packet every 200 packets. This translates to about 2.5
signatures per second, which we consider a low computational
overhead. We assume that the signature packets have about the
same size as the data packets, so in roughly 512 bytes we can
fit one 1024-bit RSA signature and the 80-bit hashes of 40
previous packets.

9 The ids of the referenced packets do not need to be stored in
the packet. Since the probability of a hash collision is
negligible, the receiver can store the hashes of the last 50
data packets it received. If a verified packet carries a hash
value equal to one of the stored hashes, we consider the
corresponding stored packet verified as well. Alternatively, we
could build a deterministically computable random graph over the
packets, and the receiver would reconstruct it. This alternative
would require a packet id in each packet.

We chose these parameters based on good engineering practice.
To find better parameters for the number of chunks that the hash
is split into and the number of chunks required to verify the
packet, we used a simulation. The simulation shows that the best
combination for this case uses 50 bytes per packet to insert 25
two-byte chunks of the hashes of previous packets. Including the
signature packets, the average communication overhead is about
55 bytes per packet. The simulation predicts an average
verification probability of 97% over the final 2000 packets,
with a minimum verification probability of 90%.
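
The Case II numbers can be checked in the same way. The sketch below assumes that each signature packet carries exactly one 128-byte signature and forty 10-byte hashes, that it is sent twice, and that its full size counts as overhead. The variable names are illustrative.

```python
LOSS = 0.6
PACKETS_PER_SECOND = 512
SIGNATURE_PERIOD = 200    # data packets between signature packets
CHUNK_BYTES_PER_PACKET = 50

# Probability that at least one of two copies arrives, treating the
# two losses as independent (the approximation made in the text).
p_signature_arrives = 1 - LOSS ** 2

signatures_per_second = PACKETS_PER_SECOND / SIGNATURE_PERIOD

# One 1024-bit RSA signature plus forty 80-bit hashes; this is close to
# the 512-byte data packet size.
signature_packet = 128 + 40 * 10

average_overhead = (
    CHUNK_BYTES_PER_PACKET + 2 * signature_packet / SIGNATURE_PERIOD
)

print("P[signature arrives]   :", round(p_signature_arrives, 2))
print("signatures per second  :", signatures_per_second)
print("average overhead/packet:", round(average_overhead, 2), "bytes")
```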

4  Previous  Work 

We  review  previous  art  which  deals  with  the  problem  of 
continuous  authentication  and  signature  of  streams. 

Gennaro and Rohatgi introduced techniques for signing digital
streams [13]. They present two different schemes, one for the
off-line case (the entire stream content is known in advance)
and the other for the on-line case (the stream content is
generated in real-time). For the off-line case, they suggest
signing the first packet and embedding in each packet $P_i$ the
hash of the next packet $P_{i+1}$ (including the hash stored in
$P_{i+1}$). While this method is elegant and provides for a
stream signature, it does not tolerate packet loss. The biggest
disadvantage, however, is that the entire stream of packets
needs to be known in advance. The on-line scheme solves this
problem through a regular signature of the initial packet and
embedding the public key of a one-time signature in each packet,
which is used to sign the subsequent packet. The limitation is
again that this scheme is not robust against packet loss. In
addition, the one-time signature communication overhead is
substantial.
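
The off-line construction described above can be illustrated with a short hash-chain sketch. It is an illustration only, not the construction from [13]: it uses SHA-256 as the hash, and it stands in for the digital signature by returning the hash of the first packet, which a real sender would sign. All names are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_chain(payloads):
    """Build packets (payload, hash_of_next_packet), back to front.

    Returns the packets and the hash of the first packet, which is the
    value the sender would sign.
    """
    packets = []
    next_hash = b""  # the last packet has no successor
    for payload in reversed(payloads):
        packets.append((payload, next_hash))
        next_hash = H(payload + next_hash)
    packets.reverse()
    return packets, next_hash


def verify_stream(packets, signed_first_hash):
    """Check every packet against the hash committed by its predecessor."""
    expected = signed_first_hash
    for payload, next_hash in packets:
        if H(payload + next_hash) != expected:
            return False
        expected = next_hash
    return True


packets, first_hash = build_chain([b"frame-0", b"frame-1", b"frame-2"])
print(verify_stream(packets, first_hash))                     # intact stream
print(verify_stream(packets[:1] + packets[2:], first_hash))   # one packet lost
```

The chain has to be built from the last packet backwards, which is why the whole stream must be known in advance, and dropping a single packet leaves everything after it unverifiable.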

Wong and Lam address the problem of data authenticity and
integrity for delay-sensitive and lossy multicast flows [31].
They propose to use Merkle's signature trees to sign streams.
Their idea to make asymmetric digital signatures more efficient
is to amortize one signature generation and verification over
multiple messages. Merkle describes how to construct a hash tree
over all messages where the signer only digitally signs the root
[20, 21]. However, to make this scheme robust against packet
loss, every packet needs to contain the signature along with all
the nodes necessary to compute the root, which requires large
space overhead. In practice, this scheme adds around 200 bytes
to each packet (assuming a 1024-bit RSA signature and a
signature tree over 16 packets). Another shortcoming is that all
messages need to be known to compute the signature tree. This
causes delays on the sender side. Furthermore, after the
signature computation, all packets are sent at the same time,
causing bursty traffic patterns. This burstiness may increase
the packet drop rate in the network. Although the computational
overhead is amortized over multiple packets, there is still a
substantial amount of computation necessary for signature
verification, which can consume a substantial amount of
resources on low-end receivers (for example battery power). A
subtle point is that the per-packet computation increases with
the packet loss rate. Since mobile receivers also have less
computational power and higher packet loss, the benefit of the
amortization is lost. The schemes which we propose in this paper
solve these shortcomings.
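
To make the per-packet cost of the tree approach concrete, the sketch below builds a hash tree over 16 messages and extracts the sibling hashes that one packet would have to carry so that the receiver can recompute the root on its own. It is a generic Merkle-tree illustration using SHA-256, not the construction from [31]. The root signature is omitted and all names are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(messages):
    """Return the tree levels, leaves first (len(messages) a power of 2)."""
    level = [H(m) for m in messages]
    levels = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path


def root_from_path(message, index, path):
    node = H(message)
    for sibling in path:
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node


messages = [f"packet-{i}".encode() for i in range(16)]
levels = build_tree(messages)
root = levels[-1][0]            # the value the sender would sign
path = auth_path(levels, 5)
print(len(path), root_from_path(messages[5], 5, path) == root)
```

With 16 packets each packet carries four sibling hashes in addition to the root signature, and the sender cannot emit any packet until all 16 messages are known.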

Rohatgi presents a new scheme which reduces the sender delay
for a packet, and which reduces the communication overhead of
one-time signatures over previously proposed schemes [28]. He
introduces a $k$-time signature scheme, which is more space
efficient than the one-time signatures. Despite all advantages,
the scheme still uses 90 bytes for a 6-time public key (which
does not include the certificate of the public key) and 300
bytes for each signature. Also, the server requires 350 off-line
hash function applications and the client needs 184 hashes on
average to verify the signature.

Canetti et al. construct a sender authentication scheme for
multicast [7]. Their solution is to use $k$ different keys to
authenticate every message with $k$ different MACs. Every
receiver knows $m$ keys and can hence verify $m$ MACs. The keys
are distributed in such a way that no coalition of $w$ receivers
can forge a packet for a specific receiver. The communication
overhead for this scheme is considerable, since every message
carries $k$ MACs. The server must also compute $k$ MACs before a
packet is sent, which makes it more expensive than the scheme we
present in this paper. Furthermore, the security of their scheme
depends on the assumption that at most a bounded number (which
is on the order of $k$) of receivers collude.
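
The cost structure of this multi-MAC approach can be seen in a minimal sketch. It only illustrates the idea of k MACs per message with each receiver holding a subset of the keys. The key-distribution strategy that gives the collusion bound in [7] is not modeled, and HMAC-SHA-256, the key sizes, and all names are assumptions.

```python
import hashlib
import hmac
import os


def make_keys(k):
    """Sender side: k independent MAC keys."""
    return [os.urandom(16) for _ in range(k)]


def authenticate(keys, message):
    """Compute one MAC per key; all k tags travel with the message."""
    return [hmac.new(key, message, hashlib.sha256).digest() for key in keys]


def verify(receiver_keys, message, tags):
    """receiver_keys maps key index -> key for the keys this receiver holds."""
    return all(
        hmac.compare_digest(
            hmac.new(key, message, hashlib.sha256).digest(), tags[index]
        )
        for index, key in receiver_keys.items()
    )


keys = make_keys(8)                      # k = 8
tags = authenticate(keys, b"stream data")
receiver = {1: keys[1], 5: keys[5]}      # this receiver knows m = 2 keys
print(len(tags), verify(receiver, b"stream data", tags))
```

Every message carries all k tags even though a given receiver checks only the few it has keys for, which is the source of the communication overhead noted above.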

Syverson, Stubblebine, and Goldschlag propose a system which
provides asymmetric and unlinkable authentication [30]. In their
system, a client proves its right to access the vendor's service
through a blinded signature token, which is renewed on each
transaction. Through the vendor's blind signature, they achieve
unlinkability of transactions. This scheme would not work for
stream authentication, because the communication and computation
overhead is substantial. Furthermore, the scheme provides
unlinkability, which is not needed for authenticating multicast
streams.

Anderson et al. [1] present a scheme which provides stream
authentication between two parties. Their Guy Fawkes protocol
has the following packet format:

$P_i = \{M_i, X_i, h(X_{i+1}), h(M_{i+1}, X_{i+1}, h(X_{i+2}))\}$

where $M_i$ denotes message $i$, $X_i$ stands for a random
number, and $h$ is a hash function. Assuming that the receiver
received an authentic packet $P_i$, it can immediately
authenticate the following packet $P_{i+1}$, since $P_i$
contains the commitment $h(M_{i+1}, X_{i+1}, h(X_{i+2}))$ to
$P_{i+1}$. Similarly, $P_{i+1}$ comes with a commitment for
$P_{i+2}$.

A drawback of this protocol is that to send message $M_i$, the
following message $M_{i+1}$ needs to be known. Furthermore, this
scheme cannot tolerate any packet loss. They propose two methods
to guarantee that the keys are not revealed too soon. The first
method is that the sender and receiver are in lockstep, i.e.,
the receiver acknowledges every packet before the sender can
send the next packet. This severely limits the transfer rate and
does not scale to a large number of receivers. The second method
to secure their scheme is to time-stamp each packet at a
time-stamping service, which introduces additional complexity.
The basic authentication Scheme I that we propose in this paper
is similar to the Guy Fawkes protocol. We improve on Guy Fawkes
and construct an efficient stream authentication scheme without
these limitations.
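
The commitment chaining described above can be sketched as follows. The packet layout used here, (M_i, X_i, h(X_{i+1}), h(M_{i+1}, X_{i+1}, h(X_{i+2}))), is an assumption inferred from the form of the commitment quoted in the text. The sketch uses SHA-256, pads the tail of the stream with a dummy message, and leaves out the initial authentication of the first packet. All names are illustrative.

```python
import hashlib
import os


def h(*parts: bytes) -> bytes:
    """Hash a tuple of byte strings with length prefixes."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big") + part)
    return digest.digest()


def build_packets(messages):
    n = len(messages)
    X = [os.urandom(16) for _ in range(n + 2)]  # X[n], X[n+1] pad the tail
    M = list(messages) + [b""]                  # dummy successor of the last
    packets = []
    for i in range(n):
        # Building packet i already requires message i + 1.
        commitment = h(M[i + 1], X[i + 1], h(X[i + 2]))
        packets.append((M[i], X[i], h(X[i + 1]), commitment))
    return packets


def follows(prev, nxt):
    """Check that `nxt` is the packet committed to by authentic `prev`."""
    _, _, hx_next, commitment = prev
    m, x, hx_after, _ = nxt
    return h(x) == hx_next and h(m, x, hx_after) == commitment


packets = build_packets([b"m0", b"m1", b"m2"])
print(follows(packets[0], packets[1]))  # consecutive packets verify
print(follows(packets[0], packets[2]))  # a gap breaks the chain
```

Both drawbacks named in the text show up directly: `build_packets` needs the next message before it can emit the current packet, and `follows` fails as soon as one packet is missing.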

We understand that unpublished work by Bob Briscoe at BT
research, and Dan Boneh and Philippe Golle, has been proceeding
along some similar lines. To the best of our knowledge, all of
these groups have been working independently.

5  Acknowledgments 

We would like to thank Pankaj Rohatgi for his help during the
early stages of the project. We would also like to thank Steve
Glassman, Mark Manasse, and Allan Heydon for their helpful
comments and discussions. We are also indebted to David Wagner
and Bob Briscoe for their relevant feedback and concrete
suggestions on how to improve the presentation of this work.
Finally, we thank the anonymous reviewers for their helpful
suggestions.

References 

[1] Ross J. Anderson, Francesco Bergadano, Bruno Crispo,
Jong-Hyeon Lee, Charalampos Manifavas, and Roger M. Needham. A
new family of authentication protocols. Operating Systems
Review, 32(4):9-20, October 1998.

[2] M. Bellare, J. Kilian, and P. Rogaway. The security of
cipher block chaining. In Yvo Desmedt, editor, Advances in
Cryptology - Crypto '94, pages 341-358, Berlin, 1994.
Springer-Verlag. Lecture Notes in Computer Science Volume 839.

[3] M. Bellare and P. Rogaway. Collision-resistant hashing:
Towards making UOWHFs practical. In Burt Kaliski, editor,
Advances in Cryptology - Crypto '97, pages 470-484, Berlin,
1997. Springer-Verlag. Lecture Notes in Computer Science Volume
1294.

[4] Mihir Bellare, Ran Canetti, and Hugo Krawczyk. Message
Authentication using Hash Functions - The HMAC Construction. RSA
Laboratories CryptoBytes, 2(1), Spring 1996.

[5] Matt Bishop. A Security Analysis of the NTP Protocol Version
2. In Sixth Annual Computer Security Applications Conference,
November 1990.

[6] M. Borella, D. Swider, S. Uludag, and G. Brewster. Internet
packet loss: Measurement and implications for end-to-end QoS. In
International Conference on Parallel Processing, August 1998.

[7] Ran Canetti, Juan Garay, Gene Itkis, Daniele Micciancio,
Moni Naor, and Benny Pinkas. Multicast security: A taxonomy and
some efficient constructions. In Infocom '99, 1999.

[8] J. L. Carter and M. N. Wegman. Universal classes of hash
functions. JCSS, (18):143-154, 1979.

[9] D. D. Clark and D. L. Tennenhouse. Architectural
considerations for a new generation of protocols. In Proceedings
of the ACM symposium on Communications architectures and
protocols SIGCOMM '90, pages 200-208, September 26-28 1990.

[10] Cryptix. http://www.cryptix.org.

[11] Stephen E. Deering. Multicast Routing in Internetworks and
Extended LANs. In Proceedings of ACM SIGCOMM '88, August 1988.

[12] T. Dierks and C. Allen. The TLS protocol version 1.0.
Internet Request for Comments RFC 2246, January 1999. Proposed
standard.

[13] Rosario Gennaro and Pankaj Rohatgi. How to Sign Digital
Streams. Technical report, IBM T.J. Watson Research Center,
1997.

[14] Oded Goldreich. Foundations of cryptography (fragments of a
book). http://www.toc.lcs.mit.edu/~oded/frag.html, 1998.

[15] S. Goldwasser, S. Micali, and R. Rivest. A digital
signature scheme secure against adaptive chosen-message attacks.
SIAM Journal of Computing, 17(2):281-308, April 1988.

[16] Mark Handley. Private communication with Adrian Perrig,
February 2000.

[17] IBM. Java web page, http://www.ibm.com/developer/java.

[18] Ipsec. IP Security Protocol, IETF working group,
http://www.ietf.org/html.charters/ipsec-charter.html.

[19] Michael George Luby. Pseudorandomness and Cryptographic
Applications. Princeton Computer Science Notes, 1996.

[20] R. C. Merkle. A certified digital signature. In Gilles
Brassard, editor, Advances in Cryptology - Crypto '89, pages
218-238, Berlin, 1989. Springer-Verlag. Lecture Notes in
Computer Science Volume 435.

[21] Ralph Merkle. Protocols for public key cryptosystems. In
1980 IEEE Symposium on Security and Privacy, 1980.

[22] David L. Mills. Network Time Protocol (Version 3)
Specification, Implementation and Analysis. Internet Request for
Comments, March 1992. RFC 1305.

[23] M. Naor and M. Yung. Universal one-way hash functions and
their cryptographic applications. In Proceedings of the Twenty
First Annual ACM Symposium on Theory of Computing (STOC '89),
1989.

[24] V. Paxson. End-to-end internet packet dynamics. IEEE/ACM
Transactions on Networking, 7(3):277-292, June 1999.

[25] M. O. Rabin. The information dispersal algorithm and its
applications, 1990.

[26] Ronald L. Rivest. The MD5 message-digest algorithm.
Internet Request for Comments, April 1992. RFC 1321.

[27] Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman. A
method for obtaining digital signatures and public-key
cryptosystems. Communications of the ACM, 21(2):120-126, 1978.

[28] Pankaj Rohatgi. A compact and fast hybrid signature scheme
for multicast packet authentication. In 6th ACM Conference on
Computer and Communications Security, November 1999.

[29] Secure Multicast User Group (SMUG),
http://www.ipmulticast.com/community/smug/.

[30] Paul F. Syverson, Stuart G. Stubblebine, and David M.
Goldschlag. Unlinkable serial transactions. In Financial
Cryptography '97, Springer Verlag, LNCS 1318, 1997.

[31] C. K. Wong and S. S. Lam. Digital signatures for flows and
multicasts. In Proc. IEEE ICNP '98, 1998.

[32] M. Yajnik, S. Moon, J. Kurose, and D. Towsley. Measurement
and modelling of the temporal dependence in packet loss. In IEEE
INFOCOM '99, New York, NY, March 1999.

A  Proof  of  Security 

In this appendix, we present a more formal statement of the
security assumptions on our cryptographic primitives and sketch
the proof of security for one of our stream authentication
schemes. First, here are the primitives we use in our schemes.

Message Authentication Codes (MACs). A function family
$\{f_k\}_{k \in \{0,1\}^{\ell}}$ (where $\ell$ is the key
length, taken to be the security parameter) is a secure MAC
family if any adversary $A$ (whose resources are bounded by a
polynomial in $\ell$) succeeds in the following game only with
negligible probability. A random $\ell$-bit key $k$ is chosen;
next $A$ can adaptively choose messages $m_1, \ldots, m_n$ and
receive the corresponding MAC values
$f_k(m_1), \ldots, f_k(m_n)$. $A$ succeeds if it manages to
forge the MAC, i.e., if it outputs a pair $m, t$ where
$m \neq m_1, \ldots, m_n$ and $t = f_k(m)$. See [2] for more
details.

Pseudorandom functions (PRFs). A function family
$\{f_k\}_{k \in \{0,1\}^{\ell}}$ (where $\ell$ is the key
length, taken to be the security parameter) is a pseudorandom
function family if any adversary $A$ (whose resources are
bounded by a polynomial in $\ell$) can distinguish between a
function $f_k$ (where $k$ is chosen randomly and kept secret)
and a totally random function only with negligible probability.
That is, a function $g$ is chosen to be either $f_k$ for a
random $\ell$-bit key, or a random function with the same range.
Next $A$ gets to ask the value of $g$ on as many points as it
likes. Nonetheless $A$ should be unable to tell whether $g$ is
random or pseudorandom. See [14, 19] for more details.

The schemes below make use of the following property of
pseudorandom functions: as long as the key $k$ is random (or
pseudorandom) and remains unknown, the value $k_1 = f_k(x)$ is
also pseudorandom for any fixed and known $x$. (In our schemes
we use the arbitrary value $x = 0$.) This allows us to securely
iterate; that is, $k_2 = f_{k_1}(x)$ is also pseudorandom, and
so on. Furthermore, the value $k'_1 = f_k(x')$ where
$x \neq x'$ is cryptographically independent from $k_1$ (as
long as $k$ remains secret) and can be used as a key for
different cryptographic transforms (such as a MAC).
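
The iteration property can be made concrete with a small key-derivation sketch. It assumes HMAC-SHA-256 as the function family $f$ and the byte strings 0x00 and 0x01 as the fixed inputs $x$ and $x'$. These concrete choices and all names are illustrative.

```python
import hashlib
import hmac


def f(key: bytes, x: bytes) -> bytes:
    """PRF instantiated with HMAC-SHA-256 (an assumption of this sketch)."""
    return hmac.new(key, x, hashlib.sha256).digest()


X = b"\x00"        # fixed, publicly known input x
X_PRIME = b"\x01"  # a different fixed input x'


def derive_chain(k: bytes, n: int):
    """Return [k_1, ..., k_n] with k_1 = f_k(x) and k_{i+1} = f_{k_i}(x)."""
    keys = []
    for _ in range(n):
        k = f(k, X)
        keys.append(k)
    return keys


def mac_key(k: bytes) -> bytes:
    """Derive a separate key k' = f_k(x') for use in a MAC."""
    return f(k, X_PRIME)


secret = bytes(range(32))       # stands in for a random secret key k
chain = derive_chain(secret, 3)
print(chain[1] == f(chain[0], X), mac_key(secret) != chain[0])
```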

Target collision resistance. A function family
$\{f_k\}_{k \in \{0,1\}^{\ell}}$ (where $\ell$ is the key
length, taken to be the security parameter) is Target Collision
Resistant if any adversary $A$ (whose resources are bounded by a
polynomial in $\ell$) can win in the following game only with
negligible probability. First $A$ generates a value $v_1$ in the
common domain of $\{f_k\}$. Next an $\ell$-bit key $k$ is
randomly chosen and given to $A$. Next $A$ wins if it generates
$v_2 \neq v_1$ such that $f_k(v_1) = f_k(v_2)$. Note that target
collision resistance implies 2nd pre-image collision resistance.
See more details in [3, 23].

In our scheme we use a PRF family $\{f_k\}$ that also has the
following flavor of target collision resistance. First a key $k$
is chosen at random, and the adversary is given $f_k(0)$. Next
the adversary is assumed to be unable (except with negligible
probability) to find $k' \neq k$ such that
$f_{k'}(0) = f_k(0)$.

Since any PRF family is also a secure MAC family, in our
schemes we use the same function family for both purposes.
Still, for clarity, in the sequel we differentiate between the
cryptographic functionality of a PRF and a MAC (see footnote
10).

In addition, we use digital signatures (secure against chosen
message attacks, see [15]), where the sender holds the signing
key and all receivers hold the corresponding public verification
key. The way in which the receivers obtain the verification key
is left out of scope.

Security Analysis of Scheme III

For brevity, we only sketch a proof of security of one of the
TESLA schemes, specifically Scheme III.

Theorem A.1. Assume that the PRF, the MAC and the signature
schemes in use are secure, and that the PRF has the TCR property
described in Section A. Then Scheme III is a secure stream
authentication scheme.

Proof sketch. For simplicity we assume that the MAC and the PRF
are realized by the same function family $\{f_k\}$. (In our
implementation, $f$ = HMAC.) Assume for contradiction that
Scheme III is not a secure stream authentication scheme. This
means that there is an adversary $A$ who controls the
communication links and manages, with non-negligible
probability, to deliver a message $m$ to a receiver $R$, such
that the sender $S$ has not sent $m$ but $R$ accepts $m$ as
authentic and coming from $S$.

10 In fact, we do not need the full security guarantee of a PRF.
It suffices to have a (length-doubling) pseudorandom generator
with a similar TCR property to the one described above.
Nonetheless, for simplicity we describe our schemes as ones
using a full-fledged PRF.

We show how to use $A$ to break the security of one of the
underlying cryptographic primitives in use. Specifically, we
construct a distinguisher $D$ that uses $A$ to break the
security of the function family $\{f_k\}$. That is, $D$ gets
access to a black-box $g$ and can tell with non-negligible
probability if $g$ is a function $f_k$ where $k$ is a random and
secret key, or if alternatively $g$ is a totally random
function. For this purpose, $D$ can query $g$ on inputs $x$ of
its choice and be answered with $g(x)$.

Distinguisher $D$ works by running $A$, as follows. Essentially,
$D$ simulates for $A$ a network with a sender $S$ and a receiver
$R$. That is:

1. $D$ chooses a number $\ell \in \{1, \ldots, M\}$ at random,
where $M$ is the total number of messages to be sent in the
stream. ($D$ hopes that $A$ will forge the $\ell$th message,
$m_\ell$.)

2. $D$ chooses signing and verification keys for $S$, and hands
the verification key to $A$.

3. $D$ hands to $A$ the initial message from $S$. This message
is signed using $S$'s signing key, and contains the key $K_0$,
plus the starting time $T_0$ and the duration $d$ of a time
interval. The key $K_0$ is generated as in the scheme, with the
following twist: Recall that in the scheme $K_0 = F^n(K_n)$
where $F(z) = f_z(0)$ and $K_n$ is a randomly chosen value (with
the appropriate length). Here, $K_0 = F^{\ell-1}(K_{\ell-1})$
where $K_{\ell-1} = g(0)$.

4. For the first $\ell - 1$ messages in the stream $D$ runs the
sender's algorithm in Scheme III with no modifications. Whenever
a message $m_i$ (with $i < \ell$) is generated, it is handed to
$A$.

5. Message $m_\ell$ is generated as in Scheme III, with the
following exception: In the scheme, the MAC in $m_\ell$ should
equal $f_{K_\ell}(M_\ell, K_{\ell-1})$ where $M_\ell$ is the
actual data in message $m_\ell$. Here, $D$ lets the MAC be
$g(M_\ell, K_{\ell-1})$.

6. $D$ inspects the messages that $A$ delivers to the receiver
$R$ from the moment $A$ receives $m_\ell$ and until time
$T_0 + \ell \cdot d$. (All times are taken locally within $D$.)
If $A$ delivers a message $m'$ that is different from $m_\ell$
and has a valid MAC with respect to $g$ (i.e., $m'$ is of the
form $m' = (M', K', g(M', K'))$), $D$ decides that $g$ was
chosen from the pseudorandom family $\{f_k\}$. Otherwise (i.e.,
$A$ does not successfully forge a message) $D$ decides that $g$
is a random function.

We sketch the argument demonstrating that D succeeds
with non-negligible probability. If g is a truly random
function then A has only negligible probability to successfully
forge the $\ell$th message in the stream. Therefore, if g is
random then D makes the wrong decision only with negligible
probability.

On the other hand, we have assumed that if the
authentication is done using $\{f_k\}$ then A forges some message with
non-negligible probability $\epsilon$. It follows that A forges the
$\ell$th message with probability at least $\epsilon/\ell$. Furthermore, our
timing assumption guarantees that A does so prior to time
$T_0 + \ell \cdot d$. It follows that if g is taken from $\{f_k\}$ then D
makes the right decision with probability at least $\epsilon/\ell$ (which
is non-negligible).

We remark that the above argument fails if A hands R
a forged initial message from S, or if for some $i < \ell$
adversary A finds a key $K'_i$ that is different from $K_i$, before
time $T_0 + i \cdot d$. However, in these cases the security of the
signature scheme or the target collision resistance of $\{f_k\}$
is compromised, respectively. □
Abstract


One  of  the  main  challenges  of  securing  multicast  com¬ 
munication  is  source  authentication,  or  enabling  receivers 
of  multicast  data  to  verify  that  the  received  data  origi¬ 
nated  with  the  claimed  source  and  was  not  modified  en- 
route.  The  problem  becomes  more  complex  in  common 
settings  where  other  receivers  of  the  data  are  not  trusted, 
and  where  lost  packets  are  not  retransmitted. 

Several  source  authentication  schemes  for  multicast 
have  been  suggested  in  the  past,  but  none  of  these  schemes 
is  satisfactorily  efficient  in  all  prominent  parameters.  We 
recently  proposed  a  very  efficient  scheme,  TESLA,  that  is 
based  on  initial  loose  time  synchronization  between  the 
sender  and  the  receivers,  followed  by  delayed  release  of 
keys  by  the  sender. 

This  paper  proposes  several  substantial  modifications 
and  improvements  to  TESLA.  One  modification  allows  re¬ 
ceivers  to  authenticate  most  packets  as  soon  as  they  arrive 
(whereas  TESLA  requires  buffering  packets  at  the  receiver 
side,  and  provides  delayed  authentication  only).  Other 
modifications  improve  the  scalability  of  the  scheme,  re¬ 
duce  the  space  overhead  for  multiple  instances,  increase 
its  resistance  to  denial-of-service  attacks,  and  more. 


1  Introduction 

With  the  growth  and  commercialization  of  the  Internet, 
simultaneous  transmission  of  data  to  multiple  receivers 
becomes  a  prevalent  mode  of  communication.  Often  the 
transmitted  data  is  streamed  and  has  considerable  band¬ 
width.  To  avoid  having  to  send  the  data  separately  to  each 
receiver,  several  multicast  routing  protocols  have  been 
proposed  and  deployed,  typically  in  the  IP  layer.  (Exam¬ 
ples  include  [12, 13, 23,  16, 6]).  The  underlying  principle 
of  multicast  communication  is  that  each  data  packet  sent 
from  the  source  reaches  a  number  of  receivers. 

Securing  multicast  communication  introduces  a  number 
of  difficulties  that  are  not  encountered  when  trying  to  se¬ 


cure  unicast  communication.  See  [9]  for  a  taxonomy  of 
multicast  security  concerns  and  some  solutions.  A  major 
concern  is  source  authentication ,  or  allowing  a  receiver  to 
ensure  that  the  received  data  is  authentic  (i.e.,  it  originates 
with  the  source  and  was  not  modified  on  the  way),  even 
when  none  of  the  other  receivers  of  the  data  is  trusted. 
Providing  source  authentication  for  multicast  communi¬ 
cation  is  the  focus  of  this  work. 

Simply deploying the standard point-to-point authentication
mechanism (i.e., appending a message authentication
code to each packet, computed using a shared key)
does  not  provide  source  authentication  in  the  case  of  mul¬ 
ticast.  The  problem  is  that  any  receiver  that  has  the  shared 
key  can  forge  data  and  impersonate  the  sender.  Conse¬ 
quently,  it  is  natural  to  look  for  solutions  based  on  asym¬ 
metric  cryptography  to  prevent  this  attack,  namely  digi¬ 
tal  signature  schemes.  Indeed,  signing  each  data  packet 
provides  good  source  authentication;  however,  it  has  high 
overhead,  both  in  terms  of  time  to  sign  and  verify,  and 
in  terms  of  bandwidth.  Several  schemes  were  proposed 
that  mitigate  this  overhead  by  amortizing  a  single  signa¬ 
ture  over  several  packets,  e.g.  [14, 33, 29].  However,  none 
of  these  schemes  is  fully  satisfactory  in  terms  of  band¬ 
width  and  processing  time,  especially  in  a  setting  where 
the  transmission  is  lossy  and  some  data  packets  may  never 
arrive.  Even  though  some  schemes  amortize  a  digital 
signature  over  multiple  data  packets,  a  serious  denial-of- 
service  attack  is  usually  possible  where  an  attacker  floods 
the  receiver  with  bogus  packets  supposedly  containing  a 
strong  signature.  Since  signature  verification  is  computa¬ 
tionally  expensive,  the  receiver  is  overwhelmed  verifying 
the  signatures. 

Another approach to providing source authentication
uses only symmetric cryptography, more specifically
message authentication codes (MACs), and is based on
delayed disclosure of keys by the sender. This technique
was first used by Cheung [11] in the context of
authenticating communication among routers. It was then used in
the Guy Fawkes protocol [1] for interactive unicast
communication. In the context of multicast streamed data it


was  proposed  by  several  authors  [8,  4,  5,  25].  In  partic¬ 
ular,  the  TESLA  scheme  described  in  [25]  was  presented 
to  the  reliable  multicast  transport  (RMT)  working  group 
[26]  of  the  IETF  and  the  secure  multicast  (SMuG)  work¬ 
ing  group  [30]  of  the  IRTF  and  was  favorably  received. 
TESLA  is  particularly  well  suited  to  provide  the  source 
authentication  functionality  for  the  MESP  header  [10],  or 
for  the  ALC  protocol  proposed  by  the  RMT  [19].  Conse¬ 
quently,  an  Internet-Draft  describing  the  scheme  was  re¬ 
cently  written  [24]. 

The main idea of TESLA is to have the sender attach to
each packet a MAC computed using a key k known only
to itself. The receiver buffers the received packet without
being able to authenticate it. If the packet is received
too late, it is discarded. A short while later, the sender
discloses k and the receiver is able to authenticate the packet.
Consequently, a single MAC per packet suffices to provide
source authentication, provided that the receiver has
synchronized its clock with the sender ahead of time.

This  idea  seems  quite  attractive  at  first.  However,  it  has 
several  shortcomings.  This  work  points  to  these  short¬ 
comings  and  proposes  methods  to  overcome  them.  Our 
description  is  based  mostly  on  TESLA,  although  the  im¬ 
provements  apply  to  the  other  schemes  as  well.  We  sketch 
some  of  these  points: 

1. In TESLA the receiver has to buffer packets until
the sender discloses the corresponding key and the
receiver authenticates the packets. This may
delay delivering the information to the application,
may cause storage problems, and also creates a
vulnerability to denial-of-service (DoS) attacks on the
receiver (by flooding it with bogus packets). We propose
a method that allows receivers to authenticate
most packets immediately upon arrival, thus reducing
the need for buffering at the receiver side and,
in particular, the susceptibility to this type of
DoS attack.

This  improvement  comes  at  the  price  of  one  extra 
hash  per  packet,  plus  some  buffering  at  the  sender 
side.  We  believe  that  buffering  at  the  sender  side  is 
often  more  reasonable  and  acceptable  than  buffering 
at  the  receiver  side.  In  particular,  it  is  not  susceptible 
to  this  type  of  DoS  attacks. 

We  also  propose  other  methods  to  alleviate  this  type 
of  DoS  attacks.  These  methods  work  even  when  the 
receiver  buffers  packets  as  in  TESLA. 

2. When operating in an environment with heterogeneous
network delay times for different receivers,
TESLA authenticates each packet using multiple
keys, where the different keys have different disclosure
delay times. This results in larger overhead, both
in processing time and in bandwidth. We propose


a  method  for  achieving  the  same  functionality  (i.e., 
different  receivers  can  authenticate  the  packets  at  dif¬ 
ferent  delays)  with  a  more  moderate  increase  in  the 
overhead  per  packet. 

3.  In  TESLA  the  sender  needs  to  perform  authenticated 
time  synchronization  individually  with  each  receiver. 
This  may  not  scale  well,  especially  in  cases  where 
many  receivers  wish  to  join  the  multicast  group  and 
synchronize  with  the  sender  at  the  same  time.  This 
is  so,  since  each  synchronization  involves  a  costly 
public-key  operation.  We  propose  a  method  that  uses 
only  a  single  public-key  operation  per  time-unit,  re¬ 
gardless  of  the  number  of  time  synchronizations  per¬ 
formed  during  this  time  unit.  This  reduces  the  cost  of 
synchronizing  with  a  receiver  to  practically  the  cost 
of  setting  up  a  simple,  unauthenticated  connection. 

4. We also explore time synchronization issues in
greater depth and describe direct and indirect time
synchronization. In the former method, the receiver
synchronizes its time directly with the sender; in the
latter method, both the sender and the receiver
synchronize their time with a time synchronization server.

For  both  cases,  we  give  a  detailed  analysis  on  how  to 
choose  the  key  disclosure  delay,  a  crucial  parameter 
for  TESLA. 

5.  TESLA  assumes  that  all  members  have  joined  the 
group  and  have  synchronized  with  the  sender  be¬ 
fore  any  transmission  starts.  In  reality,  receivers  may 
wish  to  join  after  the  transmission  has  started;  fur¬ 
thermore,  receivers  may  wish  to  receive  the  transmis¬ 
sion  immediately,  and  perform  the  time  synchroniza¬ 
tion  only  later.  We  propose  methods  that  enable  both 
functionalities.  That  is,  our  methods  allow  a  receiver 
to  join  in  “on  the  fly”  to  an  ongoing  session;  they  also 
allow  receivers  to  synchronize  at  a  later  time  and  au¬ 
thenticate  packets  only  then. 

Organization  Section  2  reviews  TESLA,  providing  fur¬ 
ther  details  than  in  [25].  Section  3  contains  the  improve¬ 
ments  and  extensions  proposed  in  this  paper.  Section  4 
provides  further  discussion  on  the  security  of  the  im¬ 
proved  scheme,  with  emphasis  on  resistance  to  denial-of- 
service  attacks. 

2  An  Overview  of  TESLA 

The security property TESLA guarantees is that the
receiver never accepts $M_i$ as an authentic message unless
$M_i$ was actually sent by the sender. Note that TESLA
does not provide non-repudiation, that is, the receiver
cannot convince a third party that the stream arrived from the
claimed source.


TESLA  is  efficient  and  has  a  low  space  overhead  mainly 
because  it  is  based  on  symmetric-key  cryptography.  Since 
source  authentication  is  an  inherently  asymmetric  prop¬ 
erty  (all  the  receivers  can  verify  the  authenticity  but  they 
cannot  produce  an  authentic  data  packet),  we  use  a  de¬ 
layed  disclosure  of  keys  to  achieve  this  property.  Simi¬ 
larly,  the  data  authentication  is  delayed  as  well.  In  prac¬ 
tice,  the  authentication  delay  is  on  the  order  of  one  round- 
trip-time  (RTT). 

TESLA  has  the  following  properties.  First,  it  has  a  low 
computation  overhead,  which  is  typically  only  one  MAC 
function  computation  per  packet,  for  both  sender  and  re¬ 
ceiver.  TESLA  also  has  a  low  per-packet  communication 
overhead,  which  is  about  20  bytes  per  packet.  In  addi¬ 
tion,  TESLA  tolerates  arbitrary  packet  loss.  Each  packet 
that  is  received  in  time  can  be  authenticated.  Except  for 
an  initial  time  synchronization,  it  has  only  unidirectional 
data  flow  from  the  sender  to  the  receiver.  No  acknowl¬ 
edgments  or  other  messages  are  necessary.  This  implies 
that  the  sender’s  stream  authentication  overhead  is  inde¬ 
pendent  of  the  number  of  receivers,  hence  TESLA  is  very 
scalable. TESLA can be used either in the network layer or
in the application layer. The delayed authentication,
however, requires buffering of packets until authentication is
completed.

For  TESLA  to  be  secure,  the  sender  and  the  receiver 
need  to  be  loosely  time  synchronized,  which  means  that 
the  synchronization  does  not  need  to  be  precise,  but  the 
receiver  needs  to  know  an  upper  bound  on  the  sender’s 
time. 

2.1  Sender  Setup 

In our model, a sender distributes a stream of data
composed of message chunks $\{M_i\}$. Generally, the sender
sends each message chunk $M_i$ in one network packet $P_i$.
Many multicast distribution protocols do not retransmit
lost packets. The goal is therefore that the receiver can
authenticate each message chunk $M_i$ separately.

For the purpose of TESLA, the sender splits the time
into even intervals $I_i$. We denote the duration of each time
interval with $T_{int}$, and the starting time of the interval $I_i$ is
$T_i$. Trivially, we have $T_i = T_0 + i \cdot T_{int}$. In each interval,
the sender may send zero or multiple packets.

Before sending the first message, the sender determines
the sending duration (possibly infinite), the interval
duration, and the number N of keys of the key chain. This
key chain is analogous to the one-way chain introduced
by Lamport [18], and the S/KEY authentication scheme
[15]. The sender picks the last key $K_N$ of the key chain
randomly and pre-computes the entire key chain using a
pseudo-random function F, which is by definition a
one-way function. Each element of the chain is defined as
$K_i = F(K_{i+1})$. Each key can be derived from $K_N$ as
$K_i = F^{N-i}(K_N)$, where $F^i(k) = F^{i-1}(F(k))$ and
$F^0(k) = k$. Each key of the key chain corresponds to
one interval, i.e., $K_j$ is active in interval $I_j$.

Since we do not want to use the same key multiple
times in different cryptographic operations, we use a
second pseudo-random function $F'$ to derive the key which
is used to compute the MAC of messages in each interval
(we will explain the algorithm in detail later). Hence,
$K'_i = F'(K_i)$. Figure 1 depicts this key derivation. We
propose to use HMAC in conjunction with a cryptographically
secure hash function for the pseudo-random function
[2]. For example, a possibility is to use the following:
$F(x) = HMAC(x, 0)$ and $F'(x) = HMAC(x, 1)$, where
0 and 1 are 8-bit integers. Note that the first argument of
the MAC function is the key and the second argument is
the data.
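The key chain and the MAC-key derivation just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: SHA-256 as the hash inside HMAC, the 32-byte key length, the chain length N = 8, and the helper names `prf`, `F`, `F_prime` and `make_key_chain` are all choices made for this example.

```python
import hashlib
import hmac
import os


def prf(key, label):
    """HMAC(key, label): the key comes first, the data is one 8-bit integer."""
    return hmac.new(key, bytes([label]), hashlib.sha256).digest()


def F(key):
    """One step down the key chain: K_i = F(K_{i+1}) = HMAC(K_{i+1}, 0)."""
    return prf(key, 0)


def F_prime(key):
    """MAC key of an interval: K'_i = F'(K_i) = HMAC(K_i, 1)."""
    return prf(key, 1)


def make_key_chain(last_key, n):
    """Return [K_0, ..., K_N], computed backwards from the random key K_N."""
    chain = [last_key]
    for _ in range(n):
        chain.append(F(chain[-1]))
    chain.reverse()
    return chain


N = 8  # hypothetical chain length (assumption for the example)
chain = make_key_chain(os.urandom(32), N)
mac_keys = [F_prime(k) for k in chain]  # K'_i for every interval I_i
```

Because each key is obtained from its successor, anyone holding a later key can recompute all earlier ones, while the separate label keeps the MAC keys distinct from the chain keys.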

2.2  Bootstrapping  a  new  Receiver 

TESLA  requires  an  initially  authenticated  data  packet  to 
bootstrap  a  new  receiver.  This  authentication  is  achieved 
with  a  digital  signature  scheme,  such  as  RSA  [28],  or  DSA 
[32]. 

We  consider  two  options  for  synchronizing  the  time,  di¬ 
rect  and  indirect  synchronization.  We  improve  the  time 
synchronization  from  our  original  work  and  describe  the 
details  in  section  3.3.  Whichever  time  synchronization 
mechanism  is  used,  the  receiver  only  needs  to  know  an 
upper  bound  on  the  sender  time. 

The  initial  authenticated  packet  contains  the  following 
information  about  the  time  intervals  and  key  chain: 

• The beginning time of a specific interval $T_j$, along
with its id $I_j$

• The interval duration $T_{int}$

• Key disclosure delay d (unit is interval)

• A commitment to the key chain $K_i$ ($i < j - d$, where
j is the current interval index)

2.3  Sending  Authenticated  Packets 

Each key of the key chain is used in one time interval.
However many messages are sent in each interval, the key
which corresponds to that interval is used to compute the
MAC of all those messages. This allows the sender to
send packets at any rate and to adapt the sending rate
dynamically. The key remains secret for d-1 future intervals.
Packets sent in interval $I_j$ can hence disclose key $K_{j-d}$.
As soon as the receivers receive that key, they can verify
the authenticity of the packets sent in interval $I_{j-d}$.

The construction of packet $P_j$ sent in interval $I_i$ is:
$\{M_j \mid MAC(K'_i, M_j) \mid K_{i-d}\}$.


Figure 1 shows the key chain construction and the MAC
key derivation. If the disclosure delay is 2 intervals, the
packet $P_{j+4}$ sent in interval $I_{i+2}$ discloses key $K_i$. From
this key, the receiver can also recover $K_{i-1}$ and verify the
MAC of $P_j$, in case $P_{j+3}$ is lost.
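As an illustration of the packet format, the following Python sketch builds the triple of message chunk, MAC and disclosed key, and verifies an earlier packet once its key is disclosed. It is a sketch under assumptions: SHA-256 inside HMAC, a fixed demo key, and the names `build_packet` and `verify_packet` are hypothetical, as is the choice to disclose nothing during the first d intervals.

```python
import hashlib
import hmac


def prf(key, label):
    """HMAC(key, label) with an 8-bit label, as in Section 2.1."""
    return hmac.new(key, bytes([label]), hashlib.sha256).digest()


def make_key_chain(last_key, n):
    """Return [K_0, ..., K_N] with K_i = HMAC(K_{i+1}, 0)."""
    chain = [last_key]
    for _ in range(n):
        chain.append(prf(chain[-1], 0))
    return chain[::-1]


def build_packet(chain, i, d, message):
    """Return (M_j, MAC(K'_i, M_j), K_{i-d}) for a packet sent in interval I_i."""
    mac_key = prf(chain[i], 1)  # K'_i = F'(K_i)
    tag = hmac.new(mac_key, message, hashlib.sha256).digest()
    disclosed = chain[i - d] if i >= d else None  # assumption: nothing to disclose yet
    return message, tag, disclosed


def verify_packet(packet, key_i):
    """Check the MAC of a buffered packet once K_i has been disclosed."""
    message, tag, _ = packet
    expected = hmac.new(prf(key_i, 1), message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


chain = make_key_chain(b"\x00" * 32, 10)  # fixed last key, for the demo only
d = 2
p_early = build_packet(chain, 3, d, b"chunk sent in I_3")
p_late = build_packet(chain, 5, d, b"chunk sent in I_5")  # discloses K_3
```

Here `p_late` carries the key that lets a receiver check the MAC of `p_early`, two intervals after `p_early` was sent.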

2.4  Receiver  Tasks 

Since  the  security  of  TESLA  depends  on  keys  that  re¬ 
main  secret  until  a  pre-determined  time  period,  the  re¬ 
ceiver  must  verify  for  each  packet  that  the  key,  which  is 
used  to  compute  the  MAC  of  that  packet,  is  not  yet  dis¬ 
closed  by  the  sender.  Otherwise,  an  attacker  could  have 
changed  the  message  data  and  re-computed  the  MAC. 
This  motivates  the  security  condition,  which  the  receiver 
must  verify  for  each  packet  it  receives. 

Security  condition:  A  packet  arrived  safely ,  if  the  re¬ 
ceiver  is  assured  that  the  sender  cannot  yet  be  in  the  time 
interval  in  which  the  corresponding  key  is  disclosed. 

The  intuition  is  that  if  a  packet  satisfies  the  security  con¬ 
dition,  then  no  attacker  could  have  altered  it  in  transit,  be¬ 
cause  the  corresponding  MAC  key  is  not  yet  disclosed.  In 
case  the  security  condition  is  not  valid,  the  receiver  must 
drop  that  packet,  because  the  authenticity  is  not  assured 
any  more.  We  would  like  to  emphasize  that  the  security  of 
this  scheme  does  not  rely  on  any  assumptions  on  network 
propagation  delay.  The  original  paper  sketches  a  security 
proof  [25]. 

We now explain how the authentication with TESLA
works with a concrete example. When the receiver
receives packet $P_j$ sent in interval $I_i$ at local time $t_c$, it
computes an upper bound on the sender’s clock $t_j$ (we
describe in section 3.3 how to compute this). To evaluate
the security condition, the receiver computes the highest
interval x the sender could possibly be in, which is
$x = \lfloor (t_j - T_0)/T_{int} \rfloor$. The receiver now verifies that
$x < I_i + d$ (where $I_i$ is the interval index), which means
that the sender must not have been in the interval in which
the key $K_i$ is disclosed, hence no attacker can possibly
know that key and spoof the message contents.
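The check above amounts to a floor division and one comparison. The following Python sketch shows it with integer time units; the function names and the use of milliseconds are assumptions made for the example.

```python
def highest_sender_interval(t_j, t_0, t_int):
    """x = floor((t_j - T_0) / T_int), with all times in integer units."""
    return (t_j - t_0) // t_int


def satisfies_security_condition(t_j, t_0, t_int, i, d):
    """True if the sender cannot yet be in the interval that discloses K_i."""
    return highest_sender_interval(t_j, t_0, t_int) < i + d


# Hypothetical numbers: T_0 = 0 ms, T_int = 100 ms, packet from interval 7,
# disclosure delay d = 2, so K_7 is disclosed in interval 9 (from 900 ms on).
safe = satisfies_security_condition(899, 0, 100, 7, 2)
unsafe = satisfies_security_condition(900, 0, 100, 7, 2)
```

With these numbers a packet is kept when the upper bound on the sender's clock is 899 ms and dropped when the bound reaches 900 ms.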

The receiver cannot, however, verify the authenticity
of the message yet. Instead, it stores the triplet
$(I_i, M_j, MAC(K'_i, M_j))$ to verify the authenticity later
when it knows $K'_i$. Two possibilities exist on how to
handle the unauthenticated message chunk $M_j$. The first
possibility is to hand $M_j$ to the application, and notify it
through a callback mechanism as soon as $M_j$ is verified.
The second possibility is to buffer $M_j$ until the authenticity
can be checked and pass it to the application as soon as
$M_j$ is authenticated.

If the packet contains a disclosed key $K_{i-d}$, regardless
of whether the security condition is verified or not, the
receiver checks whether it can use $K_{i-d}$ to authenticate
previous packets. Clearly, if it has received $K_{i-d}$
previously, it does not have any work to do. Otherwise, let
us assume that the last key value in the reconstructed key
chain is $K_v$. The receiver verifies if $K_{i-d}$ is legitimate by
verifying that $K_v = F^{i-d-v}(K_{i-d})$. If that condition is
correct, the receiver updates the key chain. For each new
key $K_w$, it computes $K'_w = F'(K_w)$ which might allow it
to verify the authenticity of previously received packets.
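The key check described above can be sketched as follows. The sketch assumes HMAC-SHA256 for F, and the function name `authenticate_disclosed_key` and its return convention (a dictionary of newly learned keys, or `None` on failure) are hypothetical choices for the example.

```python
import hashlib
import hmac


def F(key):
    """Chain step K_i = F(K_{i+1}) = HMAC(K_{i+1}, 0) (SHA-256 is an assumption)."""
    return hmac.new(key, b"\x00", hashlib.sha256).digest()


def authenticate_disclosed_key(v, key_v, index, key):
    """Check K_v == F^(index - v)(key) for a disclosed key claimed to be K_index.

    Returns {w: K_w} for v < w <= index on success, {} if the key is not
    newer than K_v, and None if the check fails.
    """
    if index <= v:
        return {}
    recovered = {}
    k = key
    for w in range(index, v, -1):
        recovered[w] = k  # candidate for K_w
        k = F(k)          # step down to the candidate for K_{w-1}
    if not hmac.compare_digest(k, key_v):
        return None
    return recovered


# Demo chain K_0..K_6 built from a fixed last key (for illustration only).
demo_chain = [b"\x01" * 32]
for _ in range(6):
    demo_chain.append(F(demo_chain[-1]))
demo_chain.reverse()

new_keys = authenticate_disclosed_key(2, demo_chain[2], 5, demo_chain[5])
```

Every key returned in `new_keys` can then be passed through F' to obtain the MAC key of its interval, which covers packets whose disclosing packet was lost.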

It  is  clear  that  this  system  can  tolerate  arbitrary  packet 
loss,  because  the  receiver  can  verify  the  authenticity  of  all 
received  packets  that  satisfy  the  security  condition  even¬ 
tually. 

3  Our  Extensions 

We  extend  TESLA  in  a  number  of  ways  to  make  it  more 
efficient  and  practical.  First,  we  present  a  new  method 
to  support  immediate  authentication ,  meaning  that  the  re¬ 
ceiver  can  authenticate  packets  as  soon  as  they  arrive. 

Second,  we  propose  optimizations  concerning  key 
chains.  In  particular,  for  applications  that  use  multiple 
authentication  chains  with  different  disclosure  delays,  we 
present  a  new  algorithm  that  reduces  the  communication 
overhead. 

Finally,  we  give  discussions  on  the  time  synchroniza¬ 
tion  issues  and  derive  a  tight  lower  bound  on  the  key  dis¬ 
closure  delay,  which  makes  the  scheme  much  more  practi¬ 
cal.  Next,  we  remove  a  scalability  limitation  of  the  simple 
time  synchronization  protocol.  Furthermore,  we  discuss 
how  a  receiver  can  authenticate  received  packets  even  if 
it  is  not  time  synchronized  at  the  moment  in  which  it  re¬ 
ceives  the  packet. 

3.1  Immediate  Authentication 

A  drawback  of  the  original  TESLA  protocol  is  that  the 
receiver  needs  to  buffer  packets  during  one  disclosure  de¬ 
lay  before  it  can  authenticate  them.  This  might  not  be 
practical  for  certain  applications  if  the  receivers  cannot  af¬ 
ford  much  buffer  space  and  bursty  traffic  might  cause  the 
receivers  to  drop  packets  due  to  insufficient  buffer  space. 
Moreover,  as  we  show  later  in  section  4.2,  the  require¬ 
ment  of  receiver  buffering  introduces  a  vulnerability  to  a 
denial-of-service  attack.  To  solve  these  problems  caused 
by  receiver-buffering,  we  propose  a  new  method  to  sup¬ 
port  immediate  authentication ,  which  allows  the  receiver 
to  authenticate  packets  as  soon  as  they  arrive. 

The  basic  observation  of  this  method  is  that  we  can 
replace  receiver  buffering  with  sender  buffering.  If  the 
sender  can  buffer  packets  during  one  disclosure  delay, 
then  it  could  store  the  hash  value  of  the  data  of  a  later 
packet  in  an  earlier  packet  and  hence  as  soon  as  the  ear¬ 
lier  packet  is  authenticated,  the  data  in  the  later  packet  is 
authenticated  through  the  hash  value  as  well. 

Figure 1: TESLA key chain and the derived MAC keys

In the new scheme, the sender buffers packets for the
duration of one disclosure delay. For simplicity of
illustration, we assume that the sender sends out a constant
number v of packets per time interval. To construct the
packet for the message chunk $M_j$ in time interval $T_i$,
the sender appends the hash value of the message chunk
$M_{j+vd}$ to $M_j$ and then computes the MAC value also over
$H(M_{j+vd})$ with the key $K_i$. Figure 2 illustrates how
the packet $P_j$ is constructed by appending $H(M_{j+vd})$,
$MAC(K_i, M_j | H(M_{j+vd}))$, along with the disclosed key
$K_{i-d}$. (Note that the | stands for message concatenation.)
When the packet $P_{j+vd}$ arrives at the receiver, it
discloses the key $K_i$, which allows authentication of packet $P_j$
sent in interval $I_i$. $P_j$ carries a hash of the data $M_{j+vd}$ in
$P_{j+vd}$. If $P_j$ is authentic, $H(M_{j+vd})$ is also authentic and
therefore the data $M_{j+vd}$ is immediately authenticated.
Also note that if $P_j$ is lost or dropped due to violation
of the security condition, $P_{j+vd}$ will not be immediately
authenticated but can still be authenticated later using the
MAC value.
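The hash-forwarding step can be illustrated with a short Python sketch. It covers only the pairing of each chunk with the hash of a later chunk and the receiver's immediate check, not the MAC or key disclosure. SHA-256 for H, the list standing in for the sender's buffer, and the names `build_payloads` and `immediately_authentic` are assumptions made for the example.

```python
import hashlib
import hmac


def H(data):
    """Hash function H (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def build_payloads(messages, v, d):
    """Pair each M_j with H(M_{j+vd}); the last v*d chunks carry no hash."""
    offset = v * d
    payloads = []
    for j, m in enumerate(messages):
        if j + offset < len(messages):
            future_hash = H(messages[j + offset])
        else:
            future_hash = b""
        payloads.append((m, future_hash))
    return payloads


def immediately_authentic(message, trusted_hash):
    """Receiver side: compare an arriving chunk with an already authenticated hash."""
    return hmac.compare_digest(H(message), trusted_hash)


messages = [b"chunk %d" % j for j in range(10)]  # stands in for the sender buffer
payloads = build_payloads(messages, v=2, d=2)    # each hash looks 4 chunks ahead
```

Once the packet carrying `payloads[0]` has been authenticated through its MAC, the chunk `messages[4]` can be accepted on arrival by checking it against `payloads[0][1]`.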


Figure 2: Immediate authentication packet example.
$D_j = H(M_{j+vd}) \mid M_j$ and
$D_{j+vd} = H(M_{j+2vd}) \mid M_{j+vd}$.

If  each  packet  can  only  carry  the  hash  of  one  other 
packet,  it  is  clear  that  the  sending  rate  needs  to  remain 
constant.  Also  it  is  clear  that  if  a  packet  is  lost,  the  corre¬ 
sponding  packet  cannot  be  immediately  authenticated.  To 
achieve  flexibility  for  dynamic  sending  rate  and  robust¬ 
ness  to  packet  loss,  the  sender  can  add  the  hash  values  of 


multiple  future  packets  to  a  packet,  similar  to  the  EMSS 
scheme  [25]. 

3.2  Concurrent  TESLA  instances 


In  this  section,  we  present  a  space  optimization  tech¬ 
nique  in  the  case  the  sender  uses  multiple  TESLA  in¬ 
stances  for  one  stream. 

Choosing  the  disclosure  delay  involves  a  tradeoff.  Re¬ 
ceivers  with  a  low  network  delay  welcome  short  key  dis¬ 
closure  delays  because  that  translates  into  a  short  authenti¬ 
cation  delay.  Unfortunately,  receivers  with  a  long  network 
delay  could  not  operate  with  a  short  disclosure  delay  be¬ 
cause  most  of  the  packets  will  violate  the  security  con¬ 
dition  and  hence  cannot  be  authenticated.  Conversely,  a 
long  disclosure  delay  would  suit  the  long  delay  receivers, 
but  causes  unnecessarily  long  authentication  delay  for  the 
receivers  with  short  network  delay.  The  solution  is  to  use 
multiple  instances  of  TESLA  with  different  disclosure  de¬ 
lays  simultaneously,  and  each  receiver  can  decide  which 
disclosure  delay,  and  hence,  which  instance  to  use.  A  sim¬ 
ple  approach  to  use  concurrent  TESLA  instances  is  to  treat 
each  TESLA  instance  independently,  with  one  key  chain 
per  instance.  The  problem  for  this  approach  is  that  each 
extra  TESLA  instance  also  causes  extra  space  overhead  in 
each  packet.  If  each  instance  requires  20  bytes  per  packet 
(80  bit  for  key  disclosure  and  80  bit  for  the  MAC  value), 
using  three  instances  results  in  60  bytes  space  overhead 
per  packet.  We  present  a  new  optimization  which  reduces 
the  space  overhead  of  concurrent  instances. 

The main idea is that instead of using one independent
key chain per TESLA instance, we could use the same
key chain but a different key schedule for all instances.
The basic scheme works as follows. All TESLA instances
for a stream share the same time interval duration and the
same key chain. Each key $K_i$ in the key chain is associated
with the corresponding time interval $T_i$, and $K_i$ will
be disclosed in $T_i$.¹ Assume that the sender uses w
instances of TESLA, which we denote with $\tau_1 \ldots \tau_w$. Each
TESLA instance $\tau_u$ has a different disclosure delay $d_u$,
and it will have a MAC key schedule derived from the key
schedule shifted by $d_u$ time intervals from the key
disclosure schedule. Let $K^u_{i+d_u}$ denote the MAC key used
by instance u in time interval $T_i$. We derive $K^u_{i+d_u}$ as
$K^u_{i+d_u} = HMAC(K_{i+d_u}, u)$. Note that we use HMAC as
a pseudo-random function, which is the same key derivation
construction as we use in TESLA (see section 2.1 and
figure 1). In fact, the keys of the first instance are derived
with the same pseudo-random function as the TESLA protocol
that uses only one instance. The reason for generating
all different, independent keys for each instance is to
prevent an attack where an attacker moves the MAC value
of an instance to another instance, which might allow it
to claim that data was sent in a different interval. Our
approach of generating independent keys prevents this
attack. Thus to compute the MAC value in packet $P_j$ in
time interval $T_i$, the sender computes one MAC value of
the message chunk $M_j$ per instance and appends the MAC
values to $M_j$. In particular, for the instance $\tau_u$ with
disclosure delay $d_u$, the sender will now use the key $K^u_{i+d_u}$
as mentioned above for the MAC computation.
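The per-instance key derivation can be sketched in Python as follows. As before, this is an illustration under assumptions: SHA-256 inside HMAC, an 8-bit encoding of the instance number u, a fixed demo key, and the names `instance_mac_key` and `mac_tags` are hypothetical.

```python
import hashlib
import hmac


def make_key_chain(last_key, n):
    """Return [K_0, ..., K_N] with K_i = HMAC(K_{i+1}, 0)."""
    chain = [last_key]
    for _ in range(n):
        chain.append(hmac.new(chain[-1], b"\x00", hashlib.sha256).digest())
    return chain[::-1]


def instance_mac_key(chain, i, d_u, u):
    """MAC key of instance u in interval T_i: HMAC(K_{i+d_u}, u)."""
    return hmac.new(chain[i + d_u], bytes([u]), hashlib.sha256).digest()


def mac_tags(chain, i, delays, message):
    """One MAC per instance; `delays` maps the instance number u to d_u."""
    tags = {}
    for u, d_u in delays.items():
        key = instance_mac_key(chain, i, d_u, u)
        tags[u] = hmac.new(key, message, hashlib.sha256).digest()
    return tags


chain = make_key_chain(b"\x00" * 32, 12)  # fixed last key, for the demo only
delays = {1: 2, 2: 4}                     # two instances, delays of 2 and 4 intervals
tags = mac_tags(chain, 3, delays, b"chunk sent in T_3")
```

Instance 1 in interval 3 and instance 2 in interval 1 both draw on the chain key K_5, yet their MAC keys differ because u enters the derivation, so a MAC cannot be moved from one instance to the other.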

Figure 3 shows an example with two TESLA instances, one with a key
disclosure time of two intervals and the other of four intervals. The
lowest line of keys shows the key disclosure schedule, i.e. which key
is disclosed in which time interval. The middle and top lines of keys
show the key schedules of the first and second instance respectively,
i.e. which key is used to compute the MAC for the packets in the given
time interval for the given instance. Using this technique, the sender
needs to disclose only one key chain no matter how many instances are
used concurrently. If each disclosed key is 10 bytes long, then for a
stream with $m$ concurrent instances, this technique saves $10(m - 1)$
bytes per packet, which is a drastic saving, in particular for small
packets.
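
The shared-chain idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the chain function, the use of SHA-256 inside HMAC, the 2-byte encoding of the instance number $u$, and all function names are choices made here for illustration. Only the rule "MAC key of instance $u$ in interval $i$ is HMAC$(K_{i+d_u}, u)$" and the 10-byte key length come from the text.

```python
import hashlib
import hmac

KEY_LEN = 10  # bytes per key, matching the 10-byte figure in the text


def prf(key: bytes, label: bytes) -> bytes:
    """HMAC used as a pseudo-random function, truncated to KEY_LEN bytes."""
    return hmac.new(key, label, hashlib.sha256).digest()[:KEY_LEN]


def make_key_chain(seed: bytes, length: int) -> list[bytes]:
    """Return [K_0, ..., K_length] with K_i = prf(K_{i+1}, b"chain")."""
    chain = [seed]
    for _ in range(length):
        chain.append(prf(chain[-1], b"chain"))
    chain.reverse()  # chain[i] is now K_i; the seed is the last key
    return chain


def instance_mac_key(chain: list[bytes], i: int, d_u: int, u: int) -> bytes:
    """MAC key of instance u in interval i: HMAC(K_{i+d_u}, u)."""
    return prf(chain[i + d_u], u.to_bytes(2, "big"))


def mac_packet(chain: list[bytes], i: int, delays: list[int],
               message: bytes) -> list[bytes]:
    """One MAC per instance; delays[u-1] is the disclosure delay d_u."""
    macs = []
    for u, d_u in enumerate(delays, start=1):
        key = instance_mac_key(chain, i, d_u, u)
        macs.append(hmac.new(key, message, hashlib.sha256).digest()[:KEY_LEN])
    return macs


def bytes_saved_per_packet(m: int, key_len: int = KEY_LEN) -> int:
    """Savings from disclosing one key instead of m keys per packet."""
    return key_len * (m - 1)
```

Because the instance number is mixed into the derivation, two instances that happen to draw on the same chain key (for example interval 2 with delay 2 and interval 0 with delay 4) still obtain different MAC keys.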

3.3 Time Synchronization Issues

Loose time synchronization is an important component of TESLA.
Although sophisticated time synchronization protocols exist, they
usually require considerable management overhead. Furthermore, they
generally have a high complexity and achieve properties that TESLA
does not require. An example is the network time protocol (NTP) by
Mills [21]; Bishop performs a detailed security analysis of NTP [7].
For these reasons, we outline a simple and secure time synchronization
protocol that suffices for the modest requirements of TESLA.

¹ Note that this key schedule is different from the previous schedule
described in section 2.1, where key $K_i$ was used to compute the MAC
in interval $I_i$ and was disclosed in interval $I_{i+d}$.


The time synchronization requirement that secures TESLA against an
active attacker is that the receiver knows an upper bound $\Delta$ on
the difference between the sender's local time and the receiver's
local time. For simplicity, we assume that the clock drift of both
sender and receiver is negligible; otherwise they simply resynchronize
periodically. We denote the real difference between the sender's and
the receiver's time by $\delta$. Hence, for loose synchronization, the
receiver does not need to know $\delta$ but only some $\Delta$ that is
guaranteed to be greater than or equal to $\delta$. To compute
$\Delta$, we can use either a direct or an indirect time
synchronization method. In the following, we first discuss a simple
protocol for direct time synchronization, and then we discuss how to
do indirect time synchronization.

Direct  Time  Synchronization 

In direct time synchronization, the receiver performs an explicit time
synchronization with the sender. This approach has the advantage that
no extra infrastructure is needed to perform the time synchronization.
We design a simple two-phase protocol that satisfies the TESLA
requirements.

In the protocol, the receiver first records its local sending time
$t_R$ and sends a time synchronization request containing a nonce to
the sender. Upon receiving the time synchronization request, the
sender records its local receiving time $t_S$ and sends the receiver a
signed response packet containing $t_S$ and the nonce.

R → S : Nonce

S → R : {Sender time $t_S$, Nonce}

Figure 4 shows a sample time synchronization between the receiver and
the sender. Upon receiving the signed response, the receiver checks
the validity of the signature and that the nonce matches, and computes
$\Delta = t_S - t_R$. It is easy to see that the $\Delta$ computed
this way satisfies the requirement that $\Delta \ge \delta$. Let $t_3$
denote the receiver's local time at the moment the request arrives at
the sender. Then $\Delta = t_S - t_R = (t_S - t_3) + (t_3 - t_R)$,
where $t_S - t_3 = \delta$ and $t_3 - t_R$ is the network delay for
sending the request from the receiver to the sender, which is greater
than or equal to 0; hence $\Delta \ge \delta$. An interesting point is
that the network delay of the response packet and the delay caused by
the computation of the digital signature do not influence $\Delta$ at
all. Since only the initial timestamp matters, it is important that
the sender immediately stores the arrival time $t_S$ of the time
synchronization request packet. The subsequent processing and
propagation delay does not matter.
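
The arithmetic of the exchange can be illustrated with a small Python simulation. It is a sketch under stated assumptions: the digital signature is omitted entirely (a real receiver must verify it before trusting $t_S$), clocks are plain floating-point numbers, and the function and field names are hypothetical.

```python
import secrets


def sender_reply(nonce: bytes, sender_clock_at_arrival: float) -> dict:
    """Sender records t_S on arrival; signing is omitted in this sketch."""
    return {"t_S": sender_clock_at_arrival, "nonce": nonce}


def receiver_delta(t_R: float, nonce: bytes, reply: dict) -> float:
    """Return Delta = t_S - t_R after checking that the nonce matches."""
    if not secrets.compare_digest(reply["nonce"], nonce):
        raise ValueError("nonce mismatch")
    return reply["t_S"] - t_R


def simulate(delta_true: float, request_delay: float,
             t_R: float = 100.0) -> float:
    """Run one exchange; delta_true is the real sender-minus-receiver offset."""
    nonce = secrets.token_bytes(16)
    t_3 = t_R + request_delay   # receiver clock when the request arrives
    t_S = t_3 + delta_true      # sender clock at that same instant
    # The response delay never enters the computation.
    return receiver_delta(t_R, nonce, sender_reply(nonce, t_S))
```

With a true offset of 5.0 and a request delay of 0.25, `simulate(5.0, 0.25)` yields 5.25, so the computed bound exceeds the true offset by exactly the request delay.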

Because the digital signature operation is computationally expensive,
we need to be careful about denial-of-service attacks where an
attacker floods the sender with time synchronization requests.
Section 4.1 addresses this issue.

Figure 3: Multiple TESLA instances key chain optimization.


Figure 4: The receiver synchronizes its time with the sender
(timelines: receiver time, sender time).


Indirect  Time  Synchronization 

In indirect time synchronization, both the sender and the receivers
synchronize their time with a time reference, and hence the sender and
the receiver reach implicit time synchronization. This approach is
favorable especially in cases where the application needs time
synchronization with a time reference anyhow. Let
$\Delta_{SC} + |\epsilon_{SC}|$ denote the measured upper bound of the
difference between the sender's time and the time reference's time,
with $|\epsilon_{SC}|$ as the maximum error, and let
$\Delta_{CR} + |\epsilon_{CR}|$ denote the measured upper bound of the
difference between the time reference's time and the receiver's time,
with $|\epsilon_{CR}|$ as the maximum error. Thus the receiver can
reach an implicit time synchronization with the sender as
$\Delta = \Delta_{SC} + \Delta_{CR} + |\epsilon_{SC}| + |\epsilon_{CR}|$,
with $\epsilon = |\epsilon_{SC}| + |\epsilon_{CR}|$ as the maximum
error.

In settings where the receiver is already time synchronized with the
time reference, the receiver does not need to send any information to
the sender. The sender just needs to periodically broadcast digitally
signed packets that contain its time synchronization with the time
reference, the time interval and key chain information outlined in
section 2.2, along with the sender's maximum synchronization error
$\epsilon_{SC}$. A new receiver can start authenticating the data
stream right after it receives one of the signed advertisements. This
is particularly useful in the case of satellite broadcast.

Delayed  Time  Synchronization 

Another interesting relaxation of the time synchronization requirement
is that, if we assume that the receiver's clock drift is negligible
during a period of time, then the receiver can receive the data stream
from the sender before doing a time synchronization and authenticate
the data later, after a time synchronization. The receiver only needs
to store the arrival time of each packet, so that it can evaluate the
security condition after it has performed the time synchronization.
This is highly useful for many applications. For example, a router can
use TESLA to authenticate itrace messages [3], and the victim can
authenticate the routers' IP markings afterwards, when it wants to
trace an attacker, by performing an approximate time synchronization
with the router [31].

3.4  Determining  the  Key  Disclosure  Delay 

An important parameter to determine for TESLA is the key disclosure
delay $d$. A short disclosure delay will cause packets to violate the
security condition and be dropped, while a long disclosure delay
causes a long authentication delay. Note that although the choice of
the disclosure delay does not affect the security of the system, it is
an important performance factor. We describe a new method for choosing
a good disclosure delay $d$. In particular, we show below that if RTT
is a reasonable upper bound on the round trip time between the
receiver and the sender, then in case of direct time synchronization
we can choose $d = \lceil RTT/T_{int} \rceil + 1$, where $T_{int}$ is
the interval duration. In case of indirect time synchronization, we
can choose $d = \lceil (D_{SR} + \epsilon)/T_{int} \rceil + 1$, where
$\epsilon$ is the sum of the sender and receiver time synchronization
errors, and $D_{SR}$ is a reasonable upper bound on the network delay
of a packet traveling from the sender to the receiver.

Consider a packet $P_i$ that is constructed using the MAC key $K_j$ in
time interval $I_j$, which will be disclosed $d$ time intervals later.
The packet $P_i$ arrives at the receiver at its local time $t^r_i$.
Hence the security condition is that

$\lfloor (t^r_i + \Delta - T_0)/T_{int} \rfloor < I_j + d$    (1)

where $T_0$ is the beginning time of the 0th time interval and
$T_{int}$ the time interval duration. Assume packet $P_i$ was sent at
the sender's local time $t^s_i$. Hence
$t^s_i < T_j + T_{int} = I_j \cdot T_{int} + T_0 + T_{int}$. We denote
the average network delay from the sender to the receiver by $D_{SR}$
and the average network delay from the receiver to the sender by
$D_{RS}$, and hence $RTT = D_{RS} + D_{SR}$.

In case of direct time synchronization, using the same notation as in
section 3.3, $\Delta = \delta + (t_3 - t_R) = \delta + D_{RS}$ and
$t^r_i + \delta - t^s_i = D_{SR}$, and hence we can derive that a
tight bound for $d$ to satisfy equation 1 is
$d = \lceil RTT/T_{int} \rceil + 1$, which allows most packets to
satisfy the security condition while the receiver does not need to
wait much longer than necessary to authenticate the packets.
Similarly, in case of indirect time synchronization, we can derive
that a good $d$ is $d = \lceil (D_{SR} + \epsilon)/T_{int} \rceil + 1$.
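
The two formulas and the security check can be tried out with a short Python sketch. The function names are hypothetical, and the exact form of the security condition used here, $\lfloor (t^r_i + \Delta - T_0)/T_{int} \rfloor < I_j + d$, is an assumption reconstructed from the surrounding definitions rather than a quotation. Times are in milliseconds.

```python
import math


def disclosure_delay_direct(rtt: float, t_int: float) -> int:
    """d = ceil(RTT / T_int) + 1 for direct time synchronization."""
    return math.ceil(rtt / t_int) + 1


def disclosure_delay_indirect(d_sr: float, eps: float, t_int: float) -> int:
    """d = ceil((D_SR + eps) / T_int) + 1 for indirect synchronization."""
    return math.ceil((d_sr + eps) / t_int) + 1


def security_condition(t_recv: float, delta: float, t0: float,
                       t_int: float, interval: int, d: int) -> bool:
    """True if the sender cannot yet have disclosed the packet's key.

    Assumed form: floor((t_recv + delta - t0) / t_int) < interval + d.
    """
    return math.floor((t_recv + delta - t0) / t_int) < interval + d


# Worked example: T_int = 50, RTT = 120 (60 each way), true offset 10.
# A packet sent at sender time 399 (interval 7) arrives at receiver
# time 399 + 60 - 10 = 449; the receiver's bound is Delta = 10 + 60 = 70.
d = disclosure_delay_direct(120, 50)
accepted = security_condition(449, 70, 0, 50, 7, d)
```

In this example the formula gives a delay of four intervals and the packet is accepted, whereas a delay of three intervals would cause the same packet to be dropped even though it was sent and delivered normally.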


4 Security Discussion and Robustness to DoS

Our original paper did not address denial-of-service (DoS) attacks on
TESLA. In an IP multicast environment, however, DoS is a considerable
threat and requires careful consideration. We discuss potential
security problems in this section and show how to strengthen TESLA to
thwart them. In particular, we show that there is no DoS attack on the
sender if the receivers perform indirect time synchronization. In case
of direct time synchronization, we show how to mitigate DoS attacks on
the sender. Although there are some potential DoS attacks on the
receiver side, we show that TESLA does not add any vulnerability to
DoS attacks if the receiver has a reasonable amount of buffer space;
for receivers that do not, we describe schemes that alleviate the
exposure to DoS.


4.1 DoS Attack on the Sender

A DoS attack on the sender is not possible if TESLA is used with
indirect time synchronization, because the sender does not keep
per-receiver state or perform per-receiver operations. In the case of
direct time synchronization, a DoS attack is possible, since the
sender is required to digitally sign each nonce included in a time
synchronization request. An attacker can perform a DoS attack by
flooding the sender with requests.

This response packet needs to be authenticated with a digital
signature scheme, such as RSA [28] or DSA [32]. Since public-key
signature algorithms are computationally expensive, the signing of the
response packet can become a performance bottleneck for the sender. A
simple trick can alleviate this situation. The sender can aggregate
multiple requests, and compute and sign a Merkle hash tree that is
generated from all the requesters' nonces [20]. Figure 5 shows how
such a hash tree is constructed. If $N_h$ is the root of the hash
tree, $N_h$ would be included in the signed part of the response
packet instead of the receiver's nonce $N_R$. To verify the digital
signature of the response packet, each receiver would reconstruct the
root of the hash tree. Since it does not know the other receivers'
nonces that are part of the hash tree, the sender would include the
nodes of the tree necessary to reconstruct the root node. For the
example in figure 5, the packet returned to receiver A would include
$N_b$ and $H_{cd}$. Receiver A can reconstruct the root node $H_{ad}$
from these values and its own nonce $N_a$ as follows:
$H_{ad} = H(H(N_a, N_b), H_{cd})$. Note that the number of nodes
returned in the response packet is logarithmic in the number of
receivers whose requests arrived in the same time interval. Assuming a
50 ms interval time (the sender would need to compute at most 20
signatures per second) and assuming that 1,000,000 receivers wanted to
synchronize their time in that interval, the return packet would only
need to contain 20 hash nodes or 200 bytes, assuming an 80-bit hash
function. Any cryptographically secure hash function can be used for
$H(x, y)$, for example MD5 [27], SHA-1 [17], or RIPEMD-160.

Figure 5: Hash tree over receiver nonces. Node $H_{ab} = H(N_a, N_b)$.
$H_{ad} = H(H_{ab}, H_{cd})$.
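
The hash tree over the requesters' nonces can be sketched as follows. This is an illustrative Python version, not the authors' code: it assumes a power-of-two number of fixed-length nonces, instantiates $H(x, y)$ as SHA-1 over the concatenation truncated to 80 bits, and uses hypothetical function names. Signing the root is omitted.

```python
import hashlib


def H(x: bytes, y: bytes) -> bytes:
    """H(x, y): SHA-1 of the concatenation, truncated to 80 bits."""
    return hashlib.sha1(x + y).digest()[:10]


def build_tree(nonces: list[bytes]) -> list[list[bytes]]:
    """Return all levels: levels[0] holds the nonces, levels[-1] the root."""
    n = len(nonces)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two number of nonces")
    levels = [list(nonces)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def auth_path(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Sibling nodes a receiver needs, each flagged True if it is a left child."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def root_from_path(nonce: bytes, path: list[tuple[bytes, bool]]) -> bytes:
    """Recompute the root from the receiver's own nonce and its path."""
    node = nonce
    for sibling, sibling_is_left in path:
        node = H(sibling, node) if sibling_is_left else H(node, sibling)
    return node


def path_length(num_receivers: int) -> int:
    """Number of sibling nodes per response, i.e. ceil(log2(num_receivers))."""
    return (num_receivers - 1).bit_length()
```

For four nonces, the path of the first receiver is its neighbour's nonce followed by the hash of the other pair, which matches the example with $N_b$ and $H_{cd}$. For 1,000,000 receivers `path_length` gives 20 nodes, or 200 bytes at 10 bytes per node.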


4.2  DoS  Attack  on  the  Receiver 

In this section, we discuss two DoS attacks on the client. Since we
assume the attacker could have full control of the network, some DoS
attacks, such as delaying or dropping packets, are always possible.
Delaying packets could cause them to violate the security condition
and hence not be authenticated. On the other hand, speeding up packets
does no harm. The receiver even benefits from this, since she might be
able to use a chain with a short disclosure delay that she could not
use otherwise. We can show that replaying packets cannot do much harm
either. First, a duplicated packet is only accepted by the receiver
within a short time period, since the security condition drops packets
if they are replayed with a long delay. Second, we can prevent the
replay attack by adding a sequence number to each packet and by
including the sequence number in the MAC. The TESLA protocol in the
network layer or in the application layer will then filter out
duplicate packets.

In the rest of the subsection, we discuss some more complicated DoS
attacks and show how to mitigate or prevent them. First, we discuss a
flooding attack that fills up the receiver buffers. Second, we discuss
an attack that tries to waste the receiver's computation resources by
making it re-compute the key chain unnecessarily.

DoS  on  the  Packet  Buffer 

A powerful attack is to flood the multicast group with bogus traffic.
This attack is serious because current multicast protocols do not
enforce sending access control.² The solution we propose involves a
weak but efficient and immediate authentication method that offers
some protection against a flooding attack.

First, if the receiver has a buffer of a certain size, we show that
flooding cannot do much harm. Because the scheme only requires the
receiver to buffer packets for the duration of one disclosure delay,
until the authenticity of the packets can be verified, the buffer size
only needs to be the product of the network bandwidth and the
disclosure delay. Assuming that the receiver has a 10 Mbps network
connection and a 500 ms disclosure delay, the required buffer size is
around 640 kB, which should in general not be a major concern with
today's workstations. Assuming 512-byte network packets, the
computation overhead to authenticate the packets is on the order of
1280 HMAC computations per second. Since the openssl HMAC-MD5
implementation processes on the order of 120,000 512-byte blocks per
second on a 500 MHz Pentium III Linux workstation, the estimated
processor overhead for TESLA authentication is on the order of 1% of
the CPU time.
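
The sizing argument above is simple arithmetic and can be checked with a few lines of Python. The sketch assumes binary prefixes (1 Mbps taken as $2^{20}$ bit/s and 1 kB as 1024 bytes), since that reading reproduces the 640 kB figure; the function names are hypothetical, and the HMAC rates passed in are the ones quoted in the text.

```python
MBIT = 2 ** 20  # assumption: binary megabit, which reproduces 640 kB


def buffer_bytes(bandwidth_bits_per_s: int, disclosure_delay_s: float) -> int:
    """Buffer size = bandwidth x disclosure delay, converted to bytes."""
    return int(bandwidth_bits_per_s * disclosure_delay_s) // 8


def packets_in_buffer(size_bytes: int, packet_bytes: int) -> int:
    """How many packets of the given size fill the buffer."""
    return size_bytes // packet_bytes


def cpu_fraction(hmacs_needed_per_s: float, hmacs_possible_per_s: float) -> float:
    """Share of CPU time spent on HMAC computations."""
    return hmacs_needed_per_s / hmacs_possible_per_s


size = buffer_bytes(10 * MBIT, 0.5)        # 655360 bytes = 640 kB
packets = packets_in_buffer(size, 512)     # 1280 packets fill the buffer
load = cpu_fraction(1280, 120_000)         # roughly 0.011, i.e. about 1%
```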

Second, if the receiver's buffer is smaller than the size computed
above, flooding could result in a DoS attack because the receiver
would drop packets due to a lack of buffer space.³

² Source-Specific Multicast (SSM) is a new multicast protocol, and a
new IETF working group was formed in August 2000 [22]. SSM aims to
address this problem by enforcing that only one legitimate sender can
send to the multicast group.

An  obvious  solution  is  to  distribute  a  shared  secret  key 
to  all  receivers  and  to  add  a  MAC  to  each  packet  with  the 
shared  secret  key.  This  enables  a  receiver  to  quickly  verify 
the  packet,  but  it  allows  an  attacker  who  knows  the  key  to 
flood  the  clients  anyhow. 

Another approach is to use the key chain as a weak authentication
method. Briscoe presents a related method for immediate authentication
[8]. The receiver pre-authenticates the packet by verifying that the
disclosed key really is part of the key chain. Based on the disclosed
key, the receiver can also immediately derive the time interval of the
packet and immediately verify the security condition. Both checks are
efficient and do not require any additional space overhead in the
packet. An attacker would need to receive a packet from the sender,
extract the disclosed key, and use that key to flood the receivers.
Fortunately, the flooding time period of each key is limited to one
interval duration.

Yet another solution is to use the immediate authentication we propose
in section 3.1. In this case, the message does not need to be added to
a queue if it is immediately authenticated.

In practice, the receiver allocates a queue for each time interval to
buffer incoming packets until they can be authenticated. If the
receiver has too little memory to buffer all incoming traffic during
the disclosure delay, it needs to decide on a drop or replacement
policy in case of a full buffer. Dropping all packets of a particular
interval once the buffer is full is a poor policy, because an attacker
might insert the spoofed traffic only early in each time interval,
causing the receivers to buffer mostly spoofed packets. Ideally, the
receiver uses a random replacement policy once the buffer is full: for
each incoming packet, the receiver picks a random packet within the
buffer to replace.

DoS  on  the  Key  Chain 

Another DoS attack is specific to how the TESLA receiver reconstructs
the key chain. If an attacker could fool a receiver into believing
that a packet was sent far in the future, the receiver would try to
verify the key disclosed in the packet by applying the pseudo-random
function repeatedly until it reaches the last committed key chain
value. This attack can easily be prevented by checking that the packet
interval is less than or equal to the latest interval that the sender
can possibly be in. For an incoming packet sent in interval $I_j$, the
receiver can verify that the interval $I_j$ is not in the future, i.e.
that the sender can already be in that interval. The verification
condition is that $I_j \le \lfloor (t_i - T_0)/T_{int} \rfloor$, where
$t_i$ is an upper bound on the sender's time that the receiver
computes at the arrival of the packet.

³ We do not consider the flooding attack from a network perspective
(where flooding can cause link congestion and result in dropping
legitimate traffic) because any network protocol is susceptible to
this attack.
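
The interval check can be placed in front of the key chain verification so that a bogus packet costs no hashing at all. The following Python sketch illustrates this under stated assumptions: the chain function `F`, the convention that a packet of interval $I_j$ discloses the key of interval $I_j - d$, and all names are illustrative, not taken from the paper.

```python
import hashlib
import hmac
import math


def F(key: bytes) -> bytes:
    """One step down the key chain (assumed construction): K_{i-1} = F(K_i)."""
    return hmac.new(key, b"chain", hashlib.sha256).digest()[:10]


def latest_possible_interval(t_upper: float, t0: float, t_int: float) -> int:
    """Latest interval the sender can be in, given the bound on its time."""
    return math.floor((t_upper - t0) / t_int)


def verify_packet_key(key: bytes, packet_interval: int, d: int,
                      committed_key: bytes, committed_index: int,
                      t_upper: float, t0: float, t_int: float) -> bool:
    """Check a disclosed key against the last committed chain value."""
    if packet_interval > latest_possible_interval(t_upper, t0, t_int):
        return False  # claims a future interval: reject before hashing
    steps = (packet_interval - d) - committed_index
    if steps <= 0:
        return False  # nothing new to verify in this sketch
    k = key
    for _ in range(steps):  # bounded by the interval check above
        k = F(k)
    return hmac.compare_digest(k, committed_key)
```

Because the claimed interval is bounded by the receiver's own clock, the number of applications of `F` per packet is bounded by the time elapsed since the last committed key, regardless of what the attacker writes into the packet.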

5  Related  Work 

Researchers have proposed signing data packets to achieve source
authentication. Since a digital signature achieves non-repudiation, a
signature is much stronger than just authentication. As we mentioned
in the introduction, the communication and computation overhead of
current signature schemes is higher than that of schemes based on
symmetric cryptography. We will review only the schemes that provide
source authentication and not the schemes providing non-repudiation,
i.e. [14, 29, 33, 25].

The earliest related work is by Cheung [11]. He proposes a scheme akin
to the basic TESLA protocol to authenticate link-state routing updates
between routers. He assumes that all the routers in a network are time
synchronized up to ±ε, and does not consider the case of heterogeneous
receivers.

Anderson et al. [1] present the Guy Fawkes protocol, which provides
message authentication between two parties. Their protocol has the
drawback that it cannot tolerate packet loss. They propose two methods
to guarantee that the keys are not revealed too soon. The first method
is that the sender and receiver are in lockstep, i.e. the receiver
acknowledges every packet before the sender can send the next packet.
This severely limits the sending rate and does not scale to a large
number of receivers. The second method to secure their scheme is to
time-stamp each packet at a time-stamping service, which introduces
additional complexity and overhead.

Canetti et al. propose to use $k$ different keys to authenticate every
message with $k$ different MACs for sender authentication [9]. Every
receiver knows $m$ keys and can hence verify $m$ MACs. The keys are
distributed in such a way that no coalition of $w$ receivers can forge
a packet for a specific receiver. The communication overhead for this
scheme is considerable, since every message carries $k$ MACs. The
server must also compute $k$ MACs before a packet is sent, which makes
it more expensive than the scheme we present in this paper.
Furthermore, the security of their scheme depends on the assumption
that at most a bounded number (which is on the order of $k$) of
receivers collude.

Briscoe proposes the FLAMeS protocol, which is similar to the Cheung
protocol [11] and to part of the basic TESLA protocol. Bergadano,
Cavagnino, and Crispo present an authentication protocol for multicast
[5]. Their protocol is similar to Cheung [11] and to parts of the
basic TESLA protocol.

Bergadano, Cavagnino, and Crispo propose a protocol similar to the Guy
Fawkes protocol to individually authenticate data streams sent within
a group [4]. Their scheme requires that the sender receives an
acknowledgment packet from each receiver before it can send the next
packet. This prevents scalability to a large group. The advantage is
that their protocol does not rely on time synchronization.

Unfortunately, their protocol is vulnerable to a man-in-the-middle
attack. To illustrate the attack, we briefly review the protocol for
one sender and one receiver (adapted to use the same notation as we
established in this paper):

B → A : $K_{B_0}$, SN, $\{K_{B_0}, SN\}_{K_B^{-1}}$
A → B : $A_1$, MAC($K_{A_1}$, $A_1$), $K_{A_0}$, SN, $\{K_{A_0}, SN\}_{K_A^{-1}}$
B → A : $K_{B_1}$
A → B : $A_2$, MAC($K_{A_2}$, $A_2$), $K_{A_1}$

In their scheme, both A (the sender) and B (the receiver) pre-compute
a key chain, $K_{A_i}$ and $K_{B_i}$ respectively. In the following
attack, B intends to authenticate data from A, but we will show that
the attacker I can forge all data. The attacker I captures all
messages from B, and it can pretend to B that all the messages come
from A. To A, the attacker I just pretends to be itself.

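B → I(A) : $K_{B_0}$, SN, $\{K_{B_0}, SN\}_{K_B^{-1}}$
I → A : $K_{I_0}$, SN, $\{K_{I_0}, SN\}_{K_I^{-1}}$
A → I : $A_1$, MAC($K_{A_1}$, $A_1$), $K_{A_0}$, SN, $\{K_{A_0}, SN\}_{K_A^{-1}}$
I → A : $K_{I_1}$
A → I : $A_2$, MAC($K_{A_2}$, $A_2$), $K_{A_1}$
I(A) → B : $A_1$, MAC($K_{A_1}$, $A_1$), $K_{A_0}$, SN, $\{K_{A_0}, SN\}_{K_A^{-1}}$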

Note that the attacker I can forge the content of the message $A_1$
sent to B, because it knows the key $K_{A_1}$, which A has already
disclosed to it. The attacker I can forge the entire subsequent
message stream without B noticing.

Another attack is that an eavesdropper that records a message exchange
between A (sender) and B (receiver) can impersonate either A or B as a
receiver to another sender C. This attack can be serious if the sender
performs access control based on the initial signature packet and the
revealed key chain. The attack is simple: the eavesdropper only needs
to replay the initial signatures and all the disclosed keys collected.

6  Conclusions 

In this paper, we have presented an extension to our TESLA scheme,
which provides a solution to the source authentication problem under
the assumption that the sender and receiver are loosely time
synchronized. The basic TESLA protocol has the following salient
properties:

•  Low computation overhead. On the order of one MAC function
computation per packet for both sender and receiver.

•  Low communication overhead. As little as one MAC value per packet
is required. Periodically, the sender also needs to send out the
secret keys.

•  Perfect loss robustness. If a packet arrives in time, the receiver
can verify its authenticity eventually (as long as it receives later
packets).

The extensions we propose in this paper feature:

•  The basic TESLA scheme provides delayed authentication. With
additional information in a packet, we show in this paper how we can
provide immediate authentication.

•  We reduce the communication overhead when multiple TESLA instances
with different authentication delays are used concurrently.

•  We derive a tight lower bound on the disclosure delay.

•  We harden the sender and the receiver against denial-of-service
attacks.

Multicast Security: A Taxonomy and Some Efficient Constructions

Abstract. Multicast communication is becoming the basis for a growing
number of applications. It is therefore critical to provide sound
security mechanisms for multicast communication. Yet, existing
security protocols for multicast offer only partial solutions.

We first present a taxonomy of multicast scenarios on the Internet and
point out relevant security concerns. Next we address two major
security problems of multicast communication: source authentication
and key revocation.

Maintaining authenticity in multicast protocols is a much more complex
problem than for unicast; in particular, known solutions are
prohibitively inefficient in many cases. We present a solution that is
reasonable for a range of scenarios. Our approach can be regarded as a
‘midpoint’ between traditional Message Authentication Codes and
digital signatures. We also present an improved solution to the key
revocation problem.

I.  Introduction 

The popularity of multicast has grown considerably with the wide use
of the Internet. Examples include Internet video transmissions, news
feeds, stock quotes, software updates, live multi-party conferencing,
on-line video games and shared whiteboards. Yet, security threats on
the Internet have flourished as well. Thus the need for secure and
efficient multicast protocols is acute.

Multicast security concerns are considerably more involved than those
regarding point-to-point communication. Even dealing with the
‘standard’ issues of message authentication and secrecy becomes much
more complex; in addition other concerns arise, such as access
control, trust in group centers, trust in routers, dynamic group
membership, and others.

A trivial solution for secure multicast is to set up a secure
point-to-point connection between every two participants (say, using
the IP-Sec protocol suite [17]). But this solution is prohibitively
inefficient in most multicast scenarios. In particular, it obviates
the use of multicast routing. Instead, we are looking for solutions
that mesh well with current multicast routing protocols, and that have
as small an overhead as possible. In particular, a realistic solution
must maintain the current way by which data packets are being routed;
yet additional control messages can be introduced, for key exchange
and access control.

This work. First, we present a taxonomy of multicast security concerns
and scenarios, with a strong emphasis on IP multicast.¹ It soon
becomes clear that the scenarios are so diverse that there is little
hope for a unified security solution that accommodates all scenarios.
Yet we suggest two ‘benchmark’ scenarios that, besides being important
on their own, have the property that solutions for these scenarios may
be a good basis in other settings. In a nutshell, one scenario
involves a single sender (say, an on-line stock-quotes distributor)
and a large number of recipients (say, hundreds of thousands). The
second scenario is on-line virtual conferencing involving up to a few
hundred participants, where many (or all) of the participants may be
sending data to the group.

Next we concentrate on a problem that emerges as a serious bottleneck
in multicast security: source and message authentication. Known
attempts to solve multicast security problems (e.g., [16], [22], [3],
[28], [29], [21]) concentrate on the task of sharing a single key
among the multicast group members. These solutions are adequate for
encrypting messages so that only group members can decrypt. However,
the single shared key approach is inadequate for source
authentication, since a key shared among all members cannot be used to
differentiate among senders in the group. In fact, the only known
solutions for multicast authentication involve heavy use of public key
signatures, and these involve considerable overhead, especially in the
work needed to generate signatures.

We present solutions to the source authentication problem based on
shared key mechanisms (namely, Message Authentication Codes, or MACs),
where each member has a different set of keys. We first present a
basic scheme and then gradually improve it to a scheme that
outperforms public-key signatures in several common scenarios. Our
main savings are in the time to generate signatures.

The  basic  source  authentication  scheme  for  a  single  sender 
draws  from  ideas  of  [2],  [11]:  the  sender  holds  a  set  of  t  keys 
and  attaches  to  each  packet  t  MACs  -  each  MAC  computed  with 
a  different  key.  Each  recipient  holds  a  subset  of  the  t  keys  and 
verifies  the  MAC  according  to  the  keys  it  holds.  Appropriate 
choice of subsets ensures that with high probability no coalition
of up to w colluding bad members (where w is a parameter)
knows all the keys held by a good member, so authenticity is
maintained.  We  present  several  enhancements  to  this  authenti¬ 
cation  scheme: 

•  A  considerable  gain  in  the  computational  overhead  of  the 
authentication  scheme  is  achieved  by  noticing  that  the  work 
needed  for  computing  some  known  MAC  functions  on  the  same 
input and t different keys is far less than t times the work to
compute  a  single  MAC.  This  is  so  since  the  message  can  first  be 
hashed  to  a  short  string  using  key-less  collision-resistant  hash¬ 
ing. 

•  Using  similar  parameters  to  those  of  the  basic  scheme,  one 
can  guarantee  that  each  good  member  has  many  keys  that  are 
known only to itself and to the sender. In order to break the
scheme  an  adversary  has  to  forge  all  the  MACs  computed  with 
these  keys.  Thus  it  is  enough  that  the  sender  attaches  to  the 
message  only  a  single  bit  out  of  each  generated  MAC  (as  long 
as  this  bit  cannot  be  successfully  ‘predicted’  without  knowing 
the  key  -  see  elaboration  within).  Consequently,  the  total  length 
of the tag attached to the message can be reduced to only t bits.
(Also,  such  MAC  functions  may  be  more  efficient  than  regular 
MACs.) 

•  A  very  similar  method  allows  for  many  senders  to  use  the 
same  structure  of  keys  —  each  sender  will  hold  a  different  sub¬ 
set  of  keys,  making  sure  that  with  high  probability  each  sender- 
recipient  pair  shares  a  sufficient  number  of  keys  that  are  not 
known  to  any  (small  enough)  bad  coalition. 

•  It  is  further  possible  to  increase  security  by  making  sure  that 
no  coalition  of  senders  can  forge  messages,  only  large  coalitions 
of  recipients  can.  This  property  is  beneficial  when  the  recip¬ 
ients  are  relatively  trusted  (say,  these  are  network  routers).  It 
is  achieved  by  differentiating  between  primary  and  secondary 
keys.  A  sender  only  receives  secondary  keys,  while  primary 
keys  are  only  held  by  the  recipients.  Each  secondary  key  is  de¬ 
rived  by  applying  a  pseudorandom  function  (e.g.,  a  block  cipher 
or  keyed  hash),  keyed  by  the  corresponding  primary  key,  to  the 
sender’s  public  identity.  Each  recipient  can  now  compute  the 
relevant  secondary  keys  and  verify  the  MACs;  yet,  no  coalition 
of  senders  knows  even  a  single  key  other  than  its  legitimate  set 
of  keys. 
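
The primary/secondary key derivation described in the last item can be sketched as follows. This is a minimal illustration, assuming HMAC-SHA256 stands in for the pseudorandom function; the helper name `derive_secondary_key`, the key sizes and the identity string are hypothetical and not taken from the text.

```python
import hashlib
import hmac
import os


def derive_secondary_key(primary_key: bytes, sender_id: bytes) -> bytes:
    """Secondary key = PRF keyed by the primary key, applied to the sender's public identity."""
    # Assumption: HMAC-SHA256 plays the role of the pseudorandom function.
    return hmac.new(primary_key, sender_id, hashlib.sha256).digest()


# Recipients hold primary keys (here: three hypothetical 16-byte keys).
primary_keys = [os.urandom(16) for _ in range(3)]
sender_id = b"sender-42"  # hypothetical public identity

# The sender is handed only the derived secondary keys, never the primary keys.
sender_keys = [derive_secondary_key(k, sender_id) for k in primary_keys]

# The sender tags a message with each of its secondary keys.
message = b"quote: ACME 101.5"
tags = [hmac.new(k, message, hashlib.sha256).digest() for k in sender_keys]

# A recipient holding primary key 1 recomputes the matching secondary key
# from the sender's identity and checks the corresponding tag.
recomputed = derive_secondary_key(primary_keys[1], sender_id)
expected = hmac.new(recomputed, message, hashlib.sha256).digest()
ok = hmac.compare_digest(expected, tags[1])
```

Because each secondary key depends on the sender's identity, two senders derived from the same primary key end up with unrelated keys, which is what keeps coalitions of senders from learning keys outside their own sets.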

Finally,  we  consider  the  membership  revocation  problem. 
When  a  member  leaves  a  multicast  group  it  might  be  required  to 
change  the  group  key  in  a  way  that  the  leaving  member  does  not 
learn the new key. A relatively efficient solution to this problem
has  been  recently  proposed  [28],  [29].  We  present  an  improve¬ 
ment  to  this  solution,  that  saves  half  of  the  communication  over¬ 
head.  (When  a  new  member  joins,  the  group  might  have  to  be 
re-keyed  as  well,  in  order  to  prevent  the  joining  member  from 
understanding  previous  group  communication.  This  is  a  much 
simpler  task:  the  group  controller  simply  multicasts  the  new  key 
encrypted  with  the  previous  group  key.) 

Organization.  In  Section  II  we  list  and  discuss  multicast  se¬ 
curity  issues,  in  several  common  scenarios.  In  Section  III  we 
present  our  multicast  authentication  schemes,  and  in  Section  IV 
we  present  our  improvements  over  past  mechanisms  for  mem¬ 
bership  revocation. 

II.  Multicast  Security  Issues 

We  overview  salient  characteristics  of  multicast  scenarios, 
and  discuss  the  relevant  security  concerns.  The  various  scenar¬ 
ios  and  concerns  are  quite  diverse  in  character  (sometimes  they 
are  even  contradictory).  Thus  it  seems  unlikely  that  a  single  so¬ 
lution  will  be  satisfactory  for  all  multicast  scenarios.  This  situa¬ 
tion  leads  us  to  suggest  two  benchmark  scenarios  for  developing 
secure  multicast  solutions. 

Multicast  group  characteristics.  We  list  salient  parameters  that 
characterize  multicast  groups.  These  parameters  affect  in  a  cru¬ 
cial  way  which  security  architecture  should  be  used.  The  group 
size can vary from several tens of participants in small discussion
groups, through thousands in virtual conferences and classes,
and up to several millions in large broadcasts. Member
characteristics  include  computing  power  (do  all  members  have 
similar  computing  power  or  can  some  members  be  loaded  more 
than  others?)  and  attention  (are  members  on-line  at  all  times?). 

A  related  parameter  is  membership  dynamics:  Is  the  group 
membership  static  and  known  in  advance?  Otherwise,  do  mem¬ 
bers  only  join,  or  do  members  also  leave?  How  frequently  does 
membership  change  and  how  fast  should  changes  become  effec¬ 
tive?  Also,  is  there  a  membership  control  center  that  has  infor¬ 
mation  about  group  membership?  Finally  what  is  the  expected 
life  time  of  the  group  (several  minutes/days/unbounded)? 

Next, what is the number and type of senders? Is there a single
party  that  sends  data?  Several  such  parties?  All  parties?  Is  the 
identity  of  the  senders  known  in  advance?  Are  non-members 
expected  to  send  data? 

Another  parameter  is  the  volume  and  type  of  traffic:  Is  there 
heavy  volume  of  communication?  Must  the  communication  ar¬ 
rive  in  real-time?  What  is  the  allowed  latency?  For  instance, 
is  it  data  communication  (less  stringent  real-time  requirements, 
low  volume),  audio  (must  be  real-time,  low  volume)  or  video 
(real-time,  high  volume)?  Also,  is  the  traffic  bursty? 

Another  parameter  that  may  become  relevant  is  the  routing 
algorithm  used.  For  instance,  a  security  mechanism  may  interact 
differently  with  dense-mode  and  sparse-mode  routing.  Also,  is 
all  routing  done  via  a  single  server  or  is  it  distributed? 

Security  requirements.  The  most  basic  security  requirements 
are  secrecy  and  authenticity.  Secrecy  usually  means  that  only 
the  multicast  group  members  (and  all  of  them)  should  be  able  to 
decipher  transmitted  data.  We  distinguish  two  types  of  secrecy: 
Ephemeral  secrecy  means  preventing  non  group-members  from 
easy  access  to  the  transmitted  data.  Here  a  mechanism  that  only 
delays  access  may  be  sufficient.  Long-term  secrecy  means  pro¬ 
tecting  the  confidentiality  of  the  data  for  a  long  period  of  time. 
This  type  of  secrecy  is  often  not  needed  for  multicast  traffic. 

Authenticity  may  take  two  flavors:  Group  authenticity  means 
that  each  group  member  can  recognize  whether  a  message  was 
sent  by  a  group  member.  Source  authenticity  means  that  it  is 
possible  to  identify  the  particular  sender  within  the  group.  It 
may  be  desirable  to  be  able  to  verify  the  origin  of  messages  even 
if  the  originator  is  not  a  group  member. 

Other  concerns  include  several  flavors  of  anonymity  (e.g., 
keeping  the  identity  of  group  members  secret  from  outsiders 
or  from  other  group  members,  or  keeping  the  identity  of  the 
sender  of  a  message  secret).  A  related  concern  is  protection 
from traffic analysis. A somewhat contradictory requirement is
non-repudiation,  or  the  ability  of  receivers  of  data  to  prove  to 
third  parties  that  the  data  has  been  transmitted. 

Access control, or making sure that only registered and legitimate
parties have access to the communication addressed to the
group,  is  usually  obtained  by  maintaining  ephemeral  secrecy  of 
the  data.  Enforcing  access  control  also  involves  authenticating 
potential  group  members.  The  access  control  problem  becomes 
considerably  more  complex  if  members  may  join  and  leave  with 
time. 

Lastly,  maintaining  service  availability  is  ever  more  relevant 
in  a  multicast  setting,  since  clogging  attacks  are  easier  to  mount 
and  are  much  more  harmful.  Here  protection  must  include 
multicast-enabled  routers  as  well  as  end-hosts. 

Trust issues. In simple scenarios there is a natural group owner
that  can  be  trusted  to  manage  the  group  security.  Typical  roles 
are  access  control,  logging  traffic  and  usage,  and  key  manage¬ 
ment.  (It  may  be  convenient,  but  not  necessary,  to  identify  the 
group  owner  with  the  core  used  in  some  multicast  routing  proto¬ 
cols,  e.g.  in  [3].)  In  other  cases  no  single  entity  is  totally  trusted; 
yet  different  entities  can  be  trusted  to  perform  different  tasks  (for 
instance,  the  access-control  entity  may  be  different  than  the  en¬ 
tity  that  distributes  keys).  In  addition,  basing  the  security  of  the 
entire  group  on  a  single  service  makes  the  system  more  vulner¬ 
able.  Thus  it  is  in  general  beneficial  to  distribute  the  security 
tasks  as  much  as  possible. 

A  natural  approach  for  distributing  trust  in  multicast  security 
centers is to use threshold cryptography [9], [13] and proactive
security [7] techniques to replace a single center with a distributed
service with no single point of failure. This is an interesting
topic for future research.

Performance.  Performance  is  a  major  concern  for  multicast  se¬ 
curity  applications.  The  most  immediate  costs  that  should  be 
minimized  are  the  latency  and  work  overhead  per  sending  and 
receiving  data  packets,  and  the  bandwidth  overhead  incurred  by 
inflating  the  data  packets  via  cryptographic  transformations.  Se¬ 
cure  memory  requirement  (e.g.,  lengths  of  keys)  is  a  somewhat 
less important resource, but should also be minimized. Here a
distinction should be made between the load on strong server
machines  and  on  weak  end-users. 

Other  performance  overheads  to  be  minimized  include  the 
group  management  activity  such  as  group  initialization  and 
member  addition  and  deletion.  Here  member  deletion  may 
cause  severe  overhead  since  keys  must  be  changed  in  order  to 
ensure  revocation  of  the  cryptographic  abilities  of  the  deleted 
members.  We  elaborate  in  Section  IV. 

An  additional  concern  is  possible  congestion,  especially 
around  centralized  control  services  at  peak  sign-on  and  sign-off 
times.  (A  quintessential  scenario  is  a  real-time  broadcast  where 
many people join right before the broadcast begins and leave right
after  it  ends.)  Another  performance  concern  is  the  work  incurred 
when  a  group  member  becomes  active  after  being  dormant  (say, 
off-line)  for  a  while. 

Benchmark  Scenarios 

As  seen  above,  it  takes  many  parameters  to  characterize  a 
multicast  security  scenario,  and  a  large  number  of  potential  sce¬ 
narios  exist.  Each  scenario  calls  for  a  different  solution;  in  fact, 
the  scenarios  are  so  different  that  it  seems  unlikely  that  a  single 
solution  will  accommodate  all.  This  is  in  sharp  contrast  with  the 
case  of  unicast  security,  where  a  single  architectural  approach 
(public-key  based  exchange  of  a  key,  followed  by  authenticat¬ 
ing  and  encrypting  each  packet  using  derived  keys)  is  sufficient 
for  most  scenarios. 

In  this  section  we  present  two  very  different  scenarios  for 
secure  multicast,  and  sketch  possible  solutions  and  challenges. 
These  scenarios  seem  to  be  the  ones  that  require  most  urgent 
solutions;  in  addition,  they  span  a  large  fraction  of  the  concerns 
described  above,  and  solutions  here  may  well  be  useful  in  other 
scenarios  as  well.  Thus  we  suggest  these  scenarios  as  bench¬ 
marks  for  evaluating  security  solutions. 


Single  source  broadcast.  Consider  a  single  source  that  wishes 
to  continuously  broadcast  data  to  a  large  number  of  recipients 
(e.g.  a  news  agency  that  broadcasts  news-feeds  and  stock-quotes 
to  paying  customers).  Such  applications  are  common  in  the  In¬ 
ternet  today,  but  they  still  typically  rely  on  unicast  routing  and 
have  few  or  no  security  protections. 

Here  the  number  of  recipients  can  be  hundreds  of  thousands 
or  even  millions.  The  source  is  typically  a  top-end  machine  with 
ample  resources.  It  can  also  be  parallelized  or  even  split  into 
several  sources  in  different  locations.  The  recipients  are  typi¬ 
cally  lower-end  machines  with  limited  resources.  Consequently, 
any  security  solution  should  be  optimized  for  efficiency  at  the 
recipient  side. 

Although the life-time of the group is usually very long, group
membership  is  typically  dynamic:  members  join  and  leave  at  a 
relatively  high  rate.  In  addition,  at  peak  times  (say,  before  and 
after important broadcasts) a high volume of sign-on/sign-off
requests is expected.

The  volume  of  transmitted  data  may  change  considerably:  if 
only  text  is  being  transmitted  then  the  volume  is  relatively  low 
(and  the  latency  requirements  are  quite  relaxed);  if  audio/video 
is  transmitted  (say,  in  on-line  pay-TV)  then  the  volume  can  be 
very  high  and  very  little  latency  is  allowed. 

Authenticity  of  the  transmitted  data  is  a  crucial  concern  and 
should  be  strictly  maintained:  a  client  must  never  accept  a 
forged  stock-quote  as  authentic.  Another  important  concern  is 
preventing non-members from using the service. This can be
achieved  by  encrypting  the  data;  yet  the  encryption  may  be  weak 
since  there  is  no  real  secrecy  requirement,  only  prevention  from 
easy  unauthorized  use.  Regarding  trust,  here  there  is  typically  a 
natural  group  owner  that  manages  access-control  as  well  as  key 
management.  However,  the  sender  of  data  may  be  a  different 
entity  (say,  Yahoo!  broadcasting  Reuters  news). 

A  natural  solution  for  this  scenario  may  have  a  group  manage¬ 
ment  center  that  handles  access  control  and  key  management. 
(To  scale  the  solution  to  a  larger  number  of  recipients  the  center 
can be distributed, or a hierarchical structure can be introduced.)
It  is  stressed  that  the  center  handles  only  ‘control  traffic’.  The 
data  packets  are  routed  using  current  multicast  routing  proto¬ 
cols.  Encryption  can  be  done  using  a  single  key  shared  by  all 
members.  Yet,  two  main  cryptographic  problems  remain:  How 
to  authenticate  messages,  and  how  to  make  sure  that  a  leaving 
member  loses  its  ability  to  decrypt. 

A  simple  and  popular  variant  of  this  scenario,  file  transmis¬ 
sion  and  updates,  typically  has  static  group  membership  and 
does  not  require  on-line  delivery  of  data. 

Virtual  Conferences.  Typical  virtual  conference  scenarios  in¬ 
clude  on-line  meetings  of  corporate  executives  or  committees, 
town-hall  type  meetings,  interactive  lectures  and  classes,  and 
multiparty  video  games.  A  virtual  conference  involves  several 
tens  to  hundreds  of  peers,  often  with  roughly  similar  compu¬ 
tational  resources.  Usually  most,  or  all,  group  members  may 
a-priori  wish  to  transmit  data  (although  often  there  is  a  small  set 
of  members  that  generate  most  of  the  bandwidth). 

The  group  is  often  formed  per  event  and  is  relatively  short¬ 
lived.  Membership  is  usually  static:  members  usually  join  at 
start-up,  and  remain  signed  on  throughout.  Furthermore,  even 
if  a  member  leaves,  cryptographically  disconnecting  it  from  the 
group is often not crucial. Bandwidth and latency requirements
vary  from  application  to  application,  similarly  to  the  case  of  sin¬ 
gle  source  broadcast. 

Authenticity  of  data  and  sender  is  the  most  crucial  security 
concern.  In  some  scenarios  maintaining  secrecy  of  data  and 
anonymity  of  members  may  be  crucial  as  well;  in  many  other 
scenarios  secrecy  of  data  is  not  a  concern  at  all.  Although  there 
is  often  a  natural  group  owner  that  may  serve  as  a  trusted  center, 
it is beneficial to distribute trust as much as possible.

Here too, a simple approach to a solution uses a server that
handles access control and key management. Encryption, when
needed, can be dealt with as above. Yet, the performance
requirements from the authentication mechanism are very different.
In particular, in contrast with the single sender scenario,
here  signing  data  packets  may  be  prohibitively  slow  on  the 
sender’s machine. In addition, there are far fewer receivers, and
the  group  members  may  be  somewhat  more  trustworthy.  Virtual 
conferencing  applications  are  also  typically  more  tolerant  to  oc¬ 
casional  and  local  authentication  errors.  These  considerations 
point  to  an  alternative  approach  to  solving  the  multicast  authen¬ 
tication  problem.  In  the  next  section  we  describe  this  alternative 
approach. 

III.  Efficient  Authentication  Schemes 

We  concentrate  on  two  approaches  to  authentication:  public 
key  signatures,  and  MACs.  (We  do  not  address  information- 
theoretic  authentication  mechanisms,  such  as  [10],  [25],  [6], 
which  are  inherently  inefficient  for  groups  of  non-trivial  size.) 

Public  key  signatures  are  perhaps  the  most  natural  mechanism 
for  multicast  authentication.  Yet,  signatures  are  typically  long, 
and  computing  and  verifying  each  signature  requires  a  signifi¬ 
cant  computational  overhead.  Applying  signatures  to  authenti¬ 
cate  streams  of  data  was  investigated  in  [14],  who  proposed  a 
chaining  mechanism  that  requires  a  single  signature  per  stream. 
These  constructions  do  not  tolerate  packet  loss,  and  are  thus  in¬ 
compatible  with  IP  multicast.  Alternatively,  [30]  suggested 
using  tree-based  hashing  to  authenticate  streams.  This  approach 
is  a  little  less  efficient,  and  incurs  some  latency,  but  it  better  tol¬ 
erates  packet  loss. 

As  an  alternative  to  public  key  signatures,  we  propose  an 
authentication  method  based  on  message  authentication  codes 
(MACs).  A  MAC  is  a  function  which  takes  a  secret  key  k  and  a 
message $M$ and returns a value $\mathrm{MAC}(k, M)$. Very informally,
a MAC scheme is unforgeable if an adversary that sees a sequence
$\{M_i, \mathrm{MAC}(k, M_i)\}$, where the $M_i$'s are adaptively chosen,
but does not know $k$, has a negligible probability of generating
$\mathrm{MAC}(k, M)$ for any $M \notin \{M_i\}$.
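
As a concrete illustration of this interface, the sketch below instantiates the MAC with HMAC-SHA256 from the Python standard library. The choice of HMAC, the function names `mac` and `verify`, and the key value are assumptions made for the example; the text does not prescribe a particular MAC.

```python
import hashlib
import hmac


def mac(key: bytes, message: bytes) -> bytes:
    """Return MAC(k, M), instantiated here with HMAC-SHA256 (an assumption)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the MAC under the shared key and compare in constant time."""
    return hmac.compare_digest(mac(key, message), tag)


k = b"shared-secret-key"          # hypothetical shared key
tag = mac(k, b"hello group")      # the sender attaches this tag to the message
accepted = verify(k, b"hello group", tag)
```

Anyone holding `k` can run both `mac` and `verify`, which is exactly why a single shared key gives group authenticity but not source authenticity.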

While  MACs  are  typically  much  more  efficient  to  generate 
and  verify  than  digital  signatures,  they  require  that  all  potential 
verifiers  have  access  to  a  shared  key,  k.  This  property  makes 
MACs  seemingly  insufficient  for  achieving  source  authentica¬ 
tion:  any  potential  receiver  who  has  the  key  k  can  “impersonate” 
the  sender.  We  present  new  MAC-based  authentication  methods 
which  achieve  source  authentication,  and  are  more  efficient  than 
public  key  based  authentication  (especially  in  the  time  to  gener¬ 
ate  signatures).  We  first  present  a  description  of  a  basic  scheme, 
followed  by  several  variants  and  improvements  (see  sketch  in 
the  Introduction). 


We  analyze  the  following  salient  resources  for  all  the  schemes 
we present: The running time required to authenticate a message
and to verify an authentication, denoted $T_S$ and $T_V$, respectively.
The length of the keys that the authenticator and the verifier
should store, denoted $M_S$ and $M_V$, respectively. The length
of the authentication message (the MAC or the signature), denoted
$C$. These resources are obviously related to the latency,
secure  memory  and  bandwidth  overhead  parameters  discussed 
in  Section  II. 

Per-message unforgeability of MAC schemes. We distinguish
between  two  types  of  attacks  against  a  MAC  scheme.  One  is 
a  complete  break,  where  the  attacker  can  authenticate  any  mes¬ 
sage  of  its  choice  (e.g.,  a  key  recovery  attack).  The  other  attack 
allows  the  attacker  to  randomly  authenticate  false  messages; 
here  the  attacker  can  authenticate  a  given  message  with  some 
fixed  and  small  probability  (but  does  not  know  a-priori  whether 
it  will  be  able  to  authenticate  the  message).  Our  schemes  do  not 
allow  complete  break  with  higher  probability  than  the  underly¬ 
ing  MAC  scheme.  Yet,  we  do  allow  for  random  authentication 
errors with non-negligible probability (say, $2^{-20}$ up to $2^{-10}$).

A  bit  more  formally,  we  say  that  a  MAC  scheme  is  q-per- 
message  unforgeable  if  no  (probabilistic  polynomial-time)  ad¬ 
versary  has  a  positive  expected  payoff  in  the  following  guessing 
game:  the  adversary  can  ask  to  receive  the  output  of  the  MAC 
on a sequence of messages $m_1, \ldots, m_k$ of its choice, and then
decide to quit or to gamble. If it quits it receives a payment of
0 dollars. Otherwise, it chooses a message $m \notin \{m_1, \ldots, m_k\}$
and tries to guess the value of the MAC on $m$. The adversary
receives $1 - q$ dollars if its guess is correct, and pays $q$ dollars otherwise.
In  other  words  the  adversary  may  guess  correctly  the  value  of 
the  MAC  with  probability  at  most  q ,  but  (except  with  negligible 
probability)  won’t  “know”  whether  its  guess  is  correct. 
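
To see why this game caps the guessing probability at q, consider, as an illustrative assumption, an adversary that always gambles and whose guess is correct with some probability p. Its expected payoff works out as follows:

```latex
\[
  \mathbb{E}[\text{payoff}]
    = p\,(1 - q) - (1 - p)\,q
    = p - q .
\]
```

The payoff is therefore positive exactly when p > q, so requiring that no adversary has a positive expected payoff amounts to requiring that no adversary can guess the MAC on a fresh message with probability above q.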

We believe that for most systems a small (although non-negligible)
per-message unforgeability (say, $q = 2^{-20}$) is sufficient.
Note that per-message unforgeability is a weaker security
property than standard unforgeability, in the sense that any
scheme that is unforgeable in the standard sense is also $q$-per-message
unforgeable (for any non-negligible value of $q$). The
converse  does  not  necessarily  hold. 

A.  The  Basic  Authentication  Scheme  for  a  Single  Source 

Let  w  be  the  maximum  number  of  corrupted  users.  The  basic 
scheme  proceeds  as  follows: 

•  The source of the transmissions ($S$) knows a set of
$\ell = e(w + 1)\ln(1/q)$ keys, $R = \{K_1, \ldots, K_\ell\}$.

•  Each recipient $u$ knows a subset of keys $R_u \subset R$. Every key
$K_i$ is included in $R_u$ with probability $1/(w + 1)$, independently
for every $i$ and $u$.²

•  Message $M$ is authenticated by $S$ with each key $K_i$ using a
MAC, and $(\mathrm{MAC}(K_1, M), \mathrm{MAC}(K_2, M), \ldots, \mathrm{MAC}(K_\ell, M))$
is transmitted together with the message.

•  Each  recipient  u  verifies  all  the  MACs  which  were  created 
using  the  keys  in  its  subset  Ru.  If  any  of  these  MACs  is  incorrect 
then  u  rejects  the  message. 

2 Notice that this can be accomplished by using a $(w + 1)$-wise independent
mapping  from  users  to  subsets. 

The performance parameters are the following. The source
must hold $M_S = \ell = e(w + 1)\ln(1/q)$ basic MAC keys. Each
receiver expects to hold $M_V = e\ln(1/q)$ MAC keys.³ The communication
overhead per message is $C = e(w + 1)\ln(1/q)$
MAC outputs. The running time overhead is $T_S = e(w + 1)\ln(1/q)$
MAC computations for the source and only $T_V = e\ln(1/q)$
MAC computations for a receiver.
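
The basic scheme and its parameters can be sketched in a few lines. This is a minimal illustration under stated assumptions: HMAC-SHA256 is used as the MAC, subsets are drawn with Python's `random` module rather than a (w + 1)-wise independent mapping, and all function names and the parameter values w = 10, q = 0.001 are hypothetical.

```python
import hashlib
import hmac
import math
import os
import random


def num_keys(w: int, q: float) -> int:
    """l = e (w + 1) ln(1/q), rounded up to an integer."""
    return math.ceil(math.e * (w + 1) * math.log(1 / q))


def assign_subset(l: int, w: int, rng: random.Random) -> set[int]:
    """Include each key index independently with probability 1/(w + 1)."""
    return {i for i in range(l) if rng.random() < 1 / (w + 1)}


def tag_message(keys: list[bytes], message: bytes) -> list[bytes]:
    """Source side: one MAC per key (HMAC-SHA256 is an assumed instantiation)."""
    return [hmac.new(k, message, hashlib.sha256).digest() for k in keys]


def verify_message(keys, subset, message, tags) -> bool:
    """Recipient side: check every MAC whose key it holds; reject on any mismatch.

    For brevity the full key list is passed in, but only indices in `subset`
    are ever read, mirroring a recipient that stores just its own keys.
    """
    return all(
        hmac.compare_digest(
            hmac.new(keys[i], message, hashlib.sha256).digest(), tags[i]
        )
        for i in subset
    )


w, q = 10, 1e-3
l = num_keys(w, q)                      # 207 keys for these parameters
keys = [os.urandom(16) for _ in range(l)]
rng = random.Random(1)                  # fixed seed, for a reproducible illustration only
subset = assign_subset(l, w, rng)       # roughly l / (w + 1) keys on average

tags = tag_message(keys, b"ACME 101.5")
accepted = verify_message(keys, subset, b"ACME 101.5", tags)
```

Note that the source's work grows with w, while a recipient only touches the keys in its own subset, whose expected size is independent of w.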

Theorem  1:  Assume  that  the  probability  of  computing  the 
output of a MAC without knowing the key is at most $q'$. Let
$u$ be a user. Then the probability that a coalition of $w$ corrupt
users can authenticate a message $M$ to $u$ is at most $q + q'$ (the
probability is taken over the choice of key subsets and over the
message).⁴

Proof (sketch): For every user $u$ and any coalition of $w$
users, the probability that a specific key is good (i.e. contained
in a user’s subset, but not in the subset of any of the
coalition members) is

$$g = \frac{1}{w+1}\left(1 - \frac{1}{w+1}\right)^{w} > \frac{1}{e(w+1)}.$$

Therefore, the probability that $R_u$ is completely covered by the
subsets held by the coalition members is

$$(1 - g)^{\ell} < \left(1 - \frac{1}{e(w+1)}\right)^{e(w+1)\ln(1/q)} < e^{-\ln(1/q)} = q.$$

If $R_u$ is not covered, the set $\{\mathrm{MAC}(K_i, M)\}_{K_i \in R_u}$ contains at
least one MAC for which the coalition does not know $K_i$. The
probability of computing it correctly is at most $q'$. By the union
bound, the probability that the coalition can authenticate $M$ to
$u$ is at most $q + q'$. □
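
The two inequalities used in the proof can be checked numerically. The sketch below is only a sanity check under assumed sample values of w and q = 0.001; the function names are hypothetical.

```python
import math


def good_key_probability(w: int) -> float:
    """g = (1/(w+1)) * (1 - 1/(w+1))**w: key held by the user and by no coalition member."""
    return (1 / (w + 1)) * (1 - 1 / (w + 1)) ** w


def cover_probability(w: int, q: float) -> float:
    """(1 - g)**l with l = e (w+1) ln(1/q): chance that none of the l keys is good."""
    l = math.e * (w + 1) * math.log(1 / q)
    return (1 - good_key_probability(w)) ** l


q = 1e-3
# For each sample coalition size, record whether both bounds from the proof hold.
checks = {
    w: (
        good_key_probability(w) > 1 / (math.e * (w + 1)),  # g > 1/(e(w+1))
        cover_probability(w, q) < q,                       # (1 - g)^l < q
    )
    for w in (1, 5, 10, 50)
}
```

The margin shrinks as w grows, since g approaches 1/(e(w + 1)) from above, but the bound stays strict.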

Notice  that  when  the  keys  of  a  user  are  not  covered  by 
the  coalition,  the  coalition  cannot  check  in  advance  (off-line) 
whether  it  can  authenticate  a  specific  message.  Therefore  the 
probability $q'$ of authenticating a message by breaking a MAC
can be rather large (e.g. even $q' = 2^{-10}$ might be reasonable for
many  applications). 

A  nice  feature  of  this  construction  is  that  the  complexity  does 
not  depend  on  the  total  number  of  parties  but  rather  only  on 
the  maximum  size  of  a  corrupt  coalition  and  the  allowed  error 
probability.  We  remark  that  a  similar  idea  was  previously  used 
by  Fiat  and  Naor  for  broadcast  encryption  (described  in  [2])  and 
by  Dyer  et  al.  for  pairwise  encryption  [11]. 

The  security  is  against  an  arbitrary,  but  fixed,  coalition  of  up 
to  w  corrupt  recipients.  Notice  that  it  is  possible  to  construct 
schemes which are secure against any coalition of size $w$ as follows.
Let $q = \left(n \cdot \binom{n}{w}\right)^{-1}$ (i.e. 1 over the number of possible
combinations of coalitions and users). By a probabilistic argument,
there exists a system for $n$ recipients in which the subset
of no user is covered by the union of the subsets of a coalition
of size $w$. The system has a total of less than $e(w + 1)^2 \ln n$
keys, and each recipient has a subset of expected size less than
$e(w + 1)\ln n$.
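
The key counts for this any-coalition variant follow directly from plugging the chosen q into the basic scheme's formulas. The sketch below computes them for a hypothetical group of n = 1000 recipients and coalition bound w = 5; these values and the function name are assumptions for illustration.

```python
import math


def any_coalition_parameters(n: int, w: int) -> tuple[float, float, float]:
    """Return (q, total_keys, keys_per_recipient) for q = 1 / (n * C(n, w))."""
    q = 1 / (n * math.comb(n, w))
    total_keys = math.e * (w + 1) * math.log(1 / q)   # l = e (w+1) ln(1/q)
    keys_per_recipient = math.e * math.log(1 / q)     # expected subset size
    return q, total_keys, keys_per_recipient


n, w = 1000, 5   # hypothetical group size and coalition bound
q, total, per_user = any_coalition_parameters(n, w)

# Bounds stated in the text, for comparison.
total_bound = math.e * (w + 1) ** 2 * math.log(n)
per_user_bound = math.e * (w + 1) * math.log(n)
```

The bounds hold because the binomial coefficient C(n, w) is at most n^w, so ln(1/q) is at most (w + 1) ln n.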

B.  Smaller  Communication  Overhead 

We  now  describe  a  scheme  with  a  lower  communication  over¬ 
head.  The  idea  behind  it  is  that  using  just  four  times  as  many 

3  A  straightforward  modification  of  this  scheme  allows  each  member  to  have 
a fixed  number  of  keys. 

4  A similar result holds with respect to per-message unforgeability. That is, if
the MAC is $q'$-per-message unforgeable then for any user $u$ and coalition of
other $w$ corrupt users, it holds with probability $1 - q$ that the resulting scheme
is $q'$-per-message unforgeable with respect to the coalition and the user.


keys  as  in  the  basic  scheme,  one  can  ensure  that  the  coalition 
does not know $\log(1/q)$ of the user’s keys. Each key can therefore
be used to produce a MAC with a single bit output and
the communication overhead is improved. The coalition would
have to guess $\log(1/q)$ bits to create a false authentication and
its  probability  of  success  is  as  before. 

Recall  the  basic  scheme:  it  limits  the  success  probability  of 
a corrupt coalition to be $q + q'$, where $q'$ is the per-message
unforgeability. The MAC output must be at least $\log_2(1/q')$ bits
long. Therefore, assuming $q' = q$, the communication overhead
is $C > e(w + 1)\ln^2(1/q)$ bits. The improved scheme achieves
a communication overhead smaller than $4e(w + 1)\ln(1/q)$ bits.
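
A quick calculation shows the size of the saving. The sketch below compares the tag length of the basic scheme (one full MAC of log2(1/q) bits per key, taking q' = q) with four times as many one-bit MACs; the sample values w = 10 and q = 2^-20 and the function names are assumptions for illustration.

```python
import math


def basic_overhead_bits(w: int, q: float) -> float:
    """l MACs of log2(1/q) bits each, with l = e (w+1) ln(1/q) and q' = q."""
    return math.e * (w + 1) * math.log(1 / q) * math.log2(1 / q)


def improved_overhead_bits(w: int, q: float) -> float:
    """4 e (w+1) ln(1/q) one-bit MACs."""
    return 4 * math.e * (w + 1) * math.log(1 / q)


w, q = 10, 2 ** -20
basic = basic_overhead_bits(w, q)
improved = improved_overhead_bits(w, q)
ratio = basic / improved   # equals log2(1/q) / 4, i.e. about 5 for q = 2^-20
```

The ratio depends only on q, not on w: the improvement factor is log2(1/q)/4, so it grows as the target error probability shrinks.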

The  improved  scheme  uses  a  MAC  with  a  single  bit  out¬ 
put.  (Current  constructions  of  MACs  have  much  larger  out¬ 
puts,  but  our  schemes  can  use  a  single  bit  of  this  output.  It 
might  also  be  possible  to  design  a  special-purpose  MAC  func¬ 
tion  with  a  single  bit  output,  which  would  be  more  efficient  than 
standard  constructions.)  For  simplicity  of  exposition,  assume 
that for this MAC $q' = 1/2$. If the keys of a corrupt coalition
do not cover $\log(1/q)$ keys of a user’s subset, then the probability
that the user accepts an unauthentic message from the
coalition is at most⁵ $q$. In the suggested scheme the source uses
$\ell = 4e(w + 1)\ln(1/q)$ keys where each key is included in a
user’s subset with probability $1/(w + 1)$.

All performance parameters are multiplied by four. The
source must store $M_S = \ell = 4e(w + 1)\ln(1/q)$ basic MAC
keys. Each receiver expects to store $M_V = 4e\ln(1/q)$ basic
MAC keys. The communication overhead is $C = 4e(w + 1)\ln(1/q)$
bits per message. The source must compute $4e(w + 1)\ln(1/q)$
MACs, whereas each receiver expects to compute
$4e\ln(1/q)$ MACs.
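
The one-bit variant can be sketched by truncating a standard MAC. This is a minimal illustration, assuming that a single bit of HMAC-SHA256 output serves as the one-bit MAC; the text only states that one bit of an existing MAC's output can be used, so the specific bit chosen, the function names, and the example key count and subset are hypothetical.

```python
import hashlib
import hmac
import os


def one_bit_mac(key: bytes, message: bytes) -> int:
    """Single-bit MAC: the lowest bit of the first HMAC-SHA256 output byte (an assumption)."""
    return hmac.new(key, message, hashlib.sha256).digest()[0] & 1


def tag_bits(keys: list[bytes], message: bytes) -> list[int]:
    """Source side: one bit per key, so the tag is len(keys) bits long."""
    return [one_bit_mac(k, message) for k in keys]


def verify_bits(keys, subset, message, bits) -> bool:
    """Recipient side: check the bits for the key indices it holds."""
    return all(one_bit_mac(keys[i], message) == bits[i] for i in subset)


keys = [os.urandom(16) for _ in range(64)]   # hypothetical key count
subset = {3, 17, 42}                         # hypothetical recipient subset
bits = tag_bits(keys, b"ACME 101.5")
accepted = verify_bits(keys, subset, b"ACME 101.5", bits)
```

A forger who lacks a key can still guess its bit correctly half of the time, which is why the scheme needs about log(1/q) uncovered keys per recipient rather than just one.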

Theorem 2: Consider a MAC with a single bit output that
is $\frac{1}{2}$-per-message unforgeable, and consider the above scheme
using this MAC and $\ell = 4e(w + 1)\ln(1/q)$ keys. Then for every
user $u$ and coalition of other $w$ corrupt users, it holds with
probability $1 - q$ that the resulting scheme is $q$-per-message
unforgeable (with respect to the coalition and $u$).

Proof (sketch): The probability that a specific key is good
is $g > \frac{1}{e(w+1)}$ as before. Since the MAC is $\frac{1}{2}$-per-message
unforgeable, the coalition cannot guess with probability better than
1/2 the output of a MAC whose key it does not know. Therefore
the expected success probability of a corrupt coalition is

$$\sum_{i=0}^{\ell} \binom{\ell}{i} g^{i} (1 - g)^{\ell - i} \left(\tfrac{1}{2}\right)^{i} = (1 - g/2)^{\ell} < q^{2}.$$

By Markov's inequality, with probability at most $q$ the coalition has a
probability greater than $q$ to compute all MACs with key in $R_u$. In
other words, with probability $1 - q$ the scheme is $q$-per-message
unforgeable. □
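
The intermediate steps of this argument can be spelled out as follows. This is a sketch under the proof's own assumptions, writing $X$ for the coalition's success probability as a function of the random key assignment (the symbol $X$ is introduced here only for illustration).

```latex
\begin{align*}
  \mathbb{E}[X]
    &= \sum_{i=0}^{\ell} \binom{\ell}{i} g^{i} (1-g)^{\ell-i} \, 2^{-i}
     = \Bigl( (1-g) + \tfrac{g}{2} \Bigr)^{\ell}
     = \bigl( 1 - \tfrac{g}{2} \bigr)^{\ell}
     && \text{(binomial theorem)} \\
    &\le e^{-g\ell/2}
     < e^{-2\ln(1/q)} = q^{2}
     && \text{since } g\,\ell > \tfrac{4e(w+1)\ln(1/q)}{e(w+1)} = 4\ln(1/q), \\
  \Pr[X > q]
    &\le \frac{\mathbb{E}[X]}{q} < q
     && \text{(Markov's inequality)} .
\end{align*}
```

The factor of four in the number of keys is what turns the bound on the expectation from q into q squared, leaving room for the Markov step.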

C.  Multiple  Dynamic  Sources 

The  schemes  presented  above  can  be  easily  extended  to  en¬ 
able  any  party  to  send  authenticated  messages.  The  global  set  of 
$\ell$ keys is $w + 1$ times bigger than in the single source scheme,
and  every  party  receives  a  random  subset  Ru  of  these  keys.  Keys 

5  More  formally,  assume  that  it  is  not  possible  to  distinguish  in  polynomial 
time  between  the  output  of  the  MAC  and  a  random  bit  with  probability  better 
than  1/2  4  e.  Then  (see  [23])  one  can  use  a  “hybrid  argument”  to  show  that  it  is 
not  possible  to  distinguish  between  m  MAC  outputs  and  an  m  bit  random  string 
with  probability  better  than  1/2  -j-  me. 
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are  included  Ry  independently  at  random  with  probability 
When  a  party  u  sends  a  message,  it  authenticates  it  with  all  the 
keys  in  and  every  receiving  party  v  verifies  the  authentica¬ 
tions  that  were  performed  with  the  keys  in  RunRy .  It  is  straight¬ 
forward  to  verify  that  the  resulting  schemes  are  as  secure  as  the 
single  source  schemes.  Note  that  the  (average)  communication 
and  computation  overheads  are  not  changed.  The  mapping  of 
users  to  subsets  can  be  done  with  a  public  ( w  +  2)-wise  inde¬ 
pendent  hash  function. 

Next, we present a better method which supports a dynamic
set of sources and has the following properties:

•  The total number of keys is as in schemes for a single source,
but every party can send authenticated messages.

•  The scheme does not require the set of sources to be defined in
advance or to contain all parties. Rather, it allows sources to be
added dynamically.

•  The scheme distinguishes between the set of sources and the
set of receivers. Only coalitions of more than $w$ receivers can
send false authenticated messages. The keys of sources do not
help such coalitions. This property is especially useful if receivers
are more trusted than senders, as might be the case for
example if the receivers are network routers.

•  The scheme provides computational (rather than information
theoretic) security against revealing to a coalition all the
keys in the intersection of a source's and a receiver's subsets.

The scheme uses a family of pseudo-random functions $\{f_k\}$
(see [20] for a discussion of pseudo-random functions). It is
based on a single source scheme and can be built upon the basic
scheme we described in Section III-A or the communication
efficient scheme of Section III-B.

Initialization: The scheme uses $\ell$ primary keys $(k_1, \ldots, k_\ell)$,
where $\ell$ is as in the single source schemes ($\ell = O(w \log(1/q))$).
Each key $k_i$ defines a pseudo-random function $f_{k_i}$.

Receiver Initialization: Each party $v$ which intends to receive
messages obtains a subset $R_v$ of primary keys. Every primary
key $k_i$ is included in $R_v$ with probability $1/(w+1)$.

Source Initialization: Every party $u$ which wishes to
send messages receives a set of secondary keys
$S_u = (f_{k_1}(u), f_{k_2}(u), \ldots, f_{k_\ell}(u))$. This set can be sent any time after
the system has been set up, and the identity or the number of
sources does not have to be defined in advance.

Message Authentication: When a party $u$ sends a message
$M$ it authenticates it with all the secondary keys in $S_u$. That is,
for every $k \in S_u$ it computes and attaches a MAC of $M$ with $k$.

Every receiving party $v$ computes all the secondary keys of $u$
with primary key in $R_v$. Namely, it computes the set
$f_{R_v}(u) = \{ f_k(u) \mid k \in R_v \}$. It then verifies all the MACs which were
computed using these keys.

The number of keys which are used and stored is as in the single
source scheme. The work of the sources is as in the previous
schemes, and receivers only have the additional task of evaluating
$f$ to compute a secondary key for each of the primary keys
in their subset.

A very useful property of this scheme is that it enables a dynamic
set of sources. New parties can be allowed to send authenticated
messages by giving them a corresponding set of secondary
keys. Another useful property of the scheme is that the
set of sources can be separated from the set of receivers, and
no coalition of sources can break the security. The scheme also makes
it possible to give sources dedicated keys for authenticating different
messages. An attractive application of these properties is to give the
source which is designated to broadcast at time $T$ the set of secondary
keys $f_k(T)$, and require it to use them to authenticate its
broadcast at that time. This approach ensures that sources can
only send information to the group in their designated time slots.

D.  Signatures  vs.  MACs:  a  rough  performance  comparison 

Compared to the performance of public key signatures, our
authentication schemes dramatically reduce the running time of
the authenticator. The running time of the verifier and the communication
overhead are of the same order as public key signatures
(the exact comparison depends on the size of the corrupt
coalitions against which the schemes operate).

Consider for example RSA signatures with a 1024-bit modulus.
Recent measurements indicate that on a fast machine
(200 MHz PowerPC) a signature (authentication) takes $T_S = 1/50$ s
and verification time is $T_V = 1/30{,}000$ s (see footnote 6). For 768-bit
DSS on a similar platform the numbers are roughly $T_S = 1/40$
and $T_V = 1/70$. In comparison, an application of the compression
function of MD5 takes about 1/500,000 of a second; an application
of DES takes roughly the same time. Future block ciphers
and hash functions are expected to be considerably faster.

The schemes we introduce require the parties to apply many
MACs with different keys to the same message. Current constructions
of MACs achieve both a hash down of the input to the
required output size, and a keyed unpredictable output. For the
suggested schemes it is preferable to perform a single hash down
of the message, and then compute MACs of the hashed down
value (see footnote 7). Regarding HMAC [19], [4] as a reference MAC
function, this implies that only one of HMAC's two nested keyed
applications of a hash function should be used (in the terms of
[4] this corresponds to defining $\ell$ MACs with keys $k_1, \ldots, k_\ell$ as
$\{\mathrm{NMAC}_{k,k_i}\}_{i=1}^{\ell}$, where the key $k$ is common to all functions).
Therefore in comparisons to public key operations we assume
that a MAC takes a single application of a compression function
of the hash function in use (say, MD5), or equivalently a single
application of a block cipher such as DES.
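The "hash down once, then MAC the digest under every key" idea can be sketched as follows. This is a minimal illustration, not the single-compression-function construction assumed above: for convenience it applies the standard library's full HMAC to the digest, uses SHA-256 rather than MD5, and keeps one output bit per key. The function name is hypothetical.

```python
import hashlib
import hmac


def one_bit_macs(message, keys):
    """Hash the message once, then derive a one-bit tag from the digest per key."""
    digest = hashlib.sha256(message).digest()   # the single hash down
    return [
        hmac.new(key, digest, hashlib.sha256).digest()[0] & 1   # keep one bit
        for key in keys
    ]


if __name__ == "__main__":
    print(one_bit_macs(b"some long message", [b"key-1", b"key-2", b"key-3"]))
```

The cost of hashing the (possibly long) message is paid once, so the per-key work no longer depends on the message length.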

Furthermore, we believe that more efficient MACs could be
designed for our authentication schemes. In particular, these
MAC functions would make use of the fact that they can have
a single bit output, and would have small amortized complexity
(for evaluations of the function on the same input and many
keys). Authentication schemes based on such functions should
be considerably more efficient than schemes based on HMAC.

Table  I  compares  the  overhead  of  RSA  and  DSS  signatures  to 
the  overhead  of  the  suggested  authentication  schemes  with  some 
specific  parameters.  The  communication  overhead  of  the  basic 
and  improved  schemes  are  based  on  using  only  10  bits  out  of 
each  MAC. 

The table describes the number of authentications and verifications
that can be performed per second, the communication
overhead in bits, and the length of the key used by the source
and the receivers. The first two rows are for RSA and DSS
signatures. The third row provides an estimate for our basic
authentication scheme, providing per-message unforgeability of
$q = 10^{-3}$ against coalitions of up to ten corrupt users. Next we
present the performance of the communication efficient variant,
in which each MAC has a single bit output. Last is the performance
of a scheme which guarantees that no coalition knows all
the keys of any user (its overhead seems too large to justify its
use).

6 The numbers here are for highly optimized RSA code with verification
exponent 3. Verification using standard RSA code is considerably slower.

7 The initial hash down is also performed for public key signatures, since
messages should be reduced to the size of the public key modulus. Therefore we
omit its computation time from the running time overhead of our schemes.

Scheme                                    Auth.      Ver.       Comm.    Source Key      Receiver Key
                                          (ops/sec)  (ops/sec)  (bits)
RSA, 1024 bits                            50         30,000     1024     2048 bits       1024 bits
DSS, 768 bits                             70         40         1536     1536 bits       1536 bits
Basic scheme, $w = 10$, $q = 10^{-3}$     2,650      26,500     1900     190 MAC keys    19 MAC keys
Low Comm., $w = 10$, $q = 10^{-3}$        660        6,600      760      760 MAC keys    76 MAC keys
Perfect Sec., $n = 10^4$, $q' = 10^{-3}$  200        2000       25,000   2500 MAC keys   250 MAC keys

TABLE I

A PERFORMANCE COMPARISON OF AUTHENTICATION SCHEMES.

It is seen that the signing time is much shorter in our scheme
than with public key signatures. The verification time is comparable
to (highly optimized) RSA and much faster than DSS (see footnote 8).
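The throughput figures for the MAC-based rows of Table I can be roughly reproduced from the key counts, assuming (as the text does) that one MAC costs one compression-function call at about 500,000 calls per second, and that each party computes one MAC per key it holds. The names below are hypothetical and the table's own values are rounded.

```python
MAC_OPS_PER_SECOND = 500_000  # compression-function rate quoted in the text


def ops_per_second(macs_per_message):
    """Messages handled per second when each one needs this many MACs."""
    return MAC_OPS_PER_SECOND / macs_per_message


# (source MAC keys, receiver MAC keys) as listed in Table I
SCHEMES = {
    "Basic scheme": (190, 19),
    "Low Comm.": (760, 76),
    "Perfect Sec.": (2500, 250),
}

if __name__ == "__main__":
    for name, (source_keys, receiver_keys) in SCHEMES.items():
        print(
            f"{name}: auth {ops_per_second(source_keys):.0f}/s, "
            f"ver {ops_per_second(receiver_keys):.0f}/s"
        )
```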

IV.  Dynamic  Secrecy  -  User  Revocation 

Secret  group  communication  can  be  achieved  by  encrypting 
messages  with  a  group  key.  This  raises  the  question  of  how  to 
add  or  remove  users  from  the  group.  When  a  new  member  joins 
the  group,  the  common  key  can  be  sent  to  the  new  member  using 
secure  unicast.  Alternatively,  if  the  previous  communications 
should  be  kept  secret  from  the  new  user,  a  new  common  key 
can  be  generated  and  sent  to  the  old  group  members  (encrypted 
with  the  old  common  key)  and  to  the  new  member  (using  secure 
unicast).  User  deletion  is  more  problematic.  Obviously,  it  is 
not  enough  to  just  ask  members  who  leave  the  group  to  delete 
their  group  key,  and  it  is  essential  to  change  the  key  with  which 
group  communication  is  encrypted  in  order  to  conceal  future 
communications  from  former  group  members.  This  problem  is 
known as user revocation or blacklisting, and is particularly important
in applications like pay-per-view in which only paying
customers should be allowed to receive transmissions.

We  survey  some  solutions  for  the  member  deletion  problem, 
describe  a  particularly  appealing  construction  from  [28],  [29] 
based  on  binary  trees,  and  present  an  improved  construction 
with  reduced  communication  overhead.  We  also  show  how  our 
construction  is  more  resistant  to  a  certain  kind  of  attack. 

A.  Some  User  Revocation  Schemes 

A trivial solution for the member revocation problem is for
each group member to share an individual secret key with a center
which controls the group. When a member is deleted from
the group, the center chooses a new common key to encrypt future
multicast messages, and sends it to every group member,
encrypted with the respective individual secret keys. This solution
does not scale up well since a group of $n$ members requires
a key renewal message with $n - 1$ new keys.

8 In addition note that if public key signatures are used for authentication then
each receiver should store the verification keys of all sources, or alternatively
the verification keys should be certified by a certification authority and then the
length of the authentication message and the verification times are doubled.

A more advanced solution was suggested by Mittra [22]. It
divides the multicast group into subgroups which are arranged
in a hierarchical structure and each has a special group controller.
The user revocation overhead is linear in the size of
a subgroup. However, this solution introduces group controllers
in every subgroup which form many possible points of failure,
both for availability and for security.

There are also suggestions to use public key technology,
namely generalized Diffie-Hellman constructions, to enable
communication efficient group re-keying (e.g. [27]). However,
for a group of $n$ members these suggestions require $O(n)$ exponentiations.
For most applications this overhead is far too high
to be acceptable in the near future.

A totally different solution was suggested by Fiat and Naor
[12] and was motivated by pay-TV applications. It enables a
single source to transmit to a dynamically changing subset of
legitimate receivers from a larger group of users, such that coalitions
of at most $k$ users cannot decrypt the transmissions unless
one of them is a member in the subset of legitimate receivers.
A very nice feature of this scheme is that the overhead of a re-keying
message does not depend on the number of users that are
removed from the group. The communication overhead of the
scheme is $O(k \log^2 k \log(1/p))$, where $p$ is an upper bound on
the probability that a coalition of at most $k$ users can decrypt a
transmission to which it is not entitled. The scheme also requires
each user to store $O(\log k \log(1/p))$ keys. The main drawback
in applying it for Internet applications is that the security is only
against coalitions of up to $k$ users, and the parameter $k$ substantially
affects the overhead of the scheme. It should also be noted
that this scheme is only suitable for a single source of transmission,
but this obstacle might be overcome if all users trust
the owner of the group and all communication is sent through a
unicast channel to this owner and from there multicast to the
group (as is the case for example in CBT routing).

B.  A  Tree  Based  Scheme 

Tree based group rekeying schemes were suggested by Wallner
et al. [28] (who used binary trees), and independently by
Wong et al. [29] (who consider the degree of the nodes of the
tree as a parameter). We concentrate on the scheme of [28] since
it requires a smaller communication overhead per user revocation.
This scheme applied to a group of $n$ users requires each
user to store $\log n + 1$ keys. It uses a message with $2 \log n - 1$ key
encryptions in order to delete a user and generate a new group
key. This process should be repeated for every deleted user. The
scheme has better performance than the Fiat-Naor scheme when
the number of deletions is not too big. It is also secure against
any number of corrupt users (they can all be deleted from the
group, no matter how many they are). A drawback of this
scheme is that if a user misses some control packet relative to a
user deletion operation (e.g., if it temporarily gets disconnected
from the network), it needs to either ask for all the missed control
packets, or incur a communication overhead comparable
to a user addition operation.

We now describe the scheme of [28]. Let $u_0, \ldots, u_{n-1}$ be $n$
members of a multicast group (in order to simplify the exposition
we assume that $n$ is a power of 2). They all share a group
key $k$ with which group communication is encrypted. There is
a single group controller, which might wish at some stage to
delete a user from the group and enable the other members to
communicate using a new key $k'$, unknown to the deleted user.

The group is initialized as follows. Users are associated to the
leaves of a tree of height $\log n$ (see Figure 1). The group controller
associates a key $k_v$ to every node $v$ of the tree, and sends to
each user (through a secure channel) the keys associated to the
nodes along the path connecting the user to the root. For example,
in the tree of Figure 1, user $u_0$ receives keys $k_{000}$, $k_{00}$, $k_0$
and $k$. Notice that the root key $k$ is known to all users and can
be used to encrypt group communications.

In order to remove a user $u$ from the group, the group controller
performs the following operations. For all nodes $v$ along
the path from $u$ to the root, a new key $k'_v$ is generated. New
keys are encrypted as follows. Key $k'_{p(u)}$ is encrypted with
key $k_{s(u)}$, where $p(u)$ and $s(u)$ denote respectively the parent
and sibling of $u$. For any other node $v$ along the path
from $u$ to the root (excluded), key $k'_{p(v)}$ is encrypted with
keys $k'_v$ and $k_{s(v)}$. All encryptions are sent to the users. For
example, in order to remove user $u_0$ from the tree of Figure 1
the following encryptions are transmitted (see Figure 2):
$E_{k_{001}}(k'_{00})$, $E_{k'_{00}}(k'_0)$, $E_{k_{01}}(k'_0)$, $E_{k'_0}(k')$, $E_{k_1}(k')$. It is easy to
verify that each user can decrypt only the keys it is entitled to
receive.
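The revocation procedure just described can be sketched as follows. Several choices here are assumptions made only for illustration: nodes are labelled by bit strings (the root is the empty string, user $u_0$ sits at leaf "000"), all function names are hypothetical, and `toy_encrypt` is an insecure XOR pad standing in for a real cipher.

```python
import hashlib
import os

KEY_LEN = 16


def toy_encrypt(key, data):
    """XOR with a hash-derived pad. Insecure toy stand-in for a cipher; self-inverse."""
    pad = hashlib.sha256(b"pad" + key).digest()[: len(data)]
    return bytes(a ^ b for a, b in zip(data, pad))


def sibling(label):
    return label[:-1] + ("1" if label[-1] == "0" else "0")


def build_tree(height):
    """Controller state: one random key per node of a complete binary tree."""
    keys = {"": os.urandom(KEY_LEN)}
    for depth in range(1, height + 1):
        for i in range(2 ** depth):
            keys[format(i, "0{}b".format(depth))] = os.urandom(KEY_LEN)
    return keys


def user_keys(keys, leaf):
    """The log n + 1 keys on the path from a leaf to the root."""
    return {leaf[:d]: keys[leaf[:d]] for d in range(len(leaf) + 1)}


def revoke(keys, leaf):
    """Replace every key above `leaf`; return (encrypting node, target node, ciphertext)."""
    message, new_keys, node = [], {}, leaf
    while node:
        parent = node[:-1]
        new_keys[parent] = os.urandom(KEY_LEN)
        # Encrypt the parent's new key under the sibling's unchanged key ...
        message.append((sibling(node), parent,
                        toy_encrypt(keys[sibling(node)], new_keys[parent])))
        # ... and, above the leaf, also under the node's own new key.
        if node != leaf:
            message.append((node, parent,
                            toy_encrypt(new_keys[node], new_keys[parent])))
        node = parent
    keys.update(new_keys)
    return message


def process(my_keys, message):
    """A user updates its keys; relies on the bottom-up order produced by revoke()."""
    for enc_label, target, ciphertext in message:
        if enc_label in my_keys:
            my_keys[target] = toy_encrypt(my_keys[enc_label], ciphertext)


if __name__ == "__main__":
    keys = build_tree(3)
    alice = user_keys(keys, "001")
    message = revoke(keys, "000")
    process(alice, message)
    print(len(message), alice[""] == keys[""])
```

For a tree of height 3 the message holds five ciphertexts, matching the $2 \log n - 1$ count, and the revoked user ends up with values that do not match the new keys because it only holds the old key of node "00".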

C.  The  Improved  Scheme 

The improved scheme reduces the communication overhead
of [28] by a factor of two, from $2 \log n$ to only $\log n$. The
initialization of the scheme is the same as in [28]. We now describe
the user revocation procedure. Let $G$ be a pseudo-random
generator which doubles the size of its input [5]. Denote by
$L(x)$, $R(x)$ the left and right halves of the output of $G(x)$, i.e.,
$G(x) = L(x)R(x)$ where $|L(x)| = |R(x)| = |x|$. To remove a
user $u$, the group controller associates a value $r_v$ to every node $v$
along the path from $u$ to the root as follows: It chooses $r_{p(u)} = r$
at random and sets $r_{p(v)} = R(r_v) = R^{|u|-|v|}(r)$ for all other $v$
(where $p(v)$ denotes the parent of $v$). The new keys are defined
by $k'_v = L(r_v) = L(R^{|u|-|v|-1}(r))$. Notice that from $r_v$, one
can easily compute all keys $k'_v$, $k'_{p(v)}$, $k'_{p(p(v))}$ up to the root key
$k'$. Finally each value $r_{p(v)}$ is encrypted with key $k_{s(v)}$ (where
$s(v)$ denotes the sibling of $v$) and sent to all users. For example,
in order to remove user $u_0$ from the tree of Figure 1, we send
encryptions $E_{k_{001}}(r)$, $E_{k_{01}}(R(r))$, $E_{k_1}(R(R(r)))$. One can easily
verify that, under the assumption that $G$ is a cryptographically
strong pseudo-random generator, each user can compute from
the encryptions all and only the keys it is entitled to receive.

Fig. 1. The tree key data structure (the keys of $u_0$ are encircled).

Fig. 2. Key revocation in the basic scheme.
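The improved revocation can be sketched in the same style. As assumptions for this illustration only: the generator halves $L$ and $R$ are modelled by two domain-separated SHA-256 calls, `toy_encrypt` is an insecure XOR pad standing in for a real cipher, nodes are labelled by bit strings with the root as the empty string, and all names are hypothetical.

```python
import hashlib
import os

KEY_LEN = 16


def prg_left(x):
    """L(x): left half of G(x), modelled with SHA-256 (an assumption)."""
    return hashlib.sha256(b"L" + x).digest()[:KEY_LEN]


def prg_right(x):
    """R(x): right half of G(x), modelled with SHA-256 (an assumption)."""
    return hashlib.sha256(b"R" + x).digest()[:KEY_LEN]


def toy_encrypt(key, data):
    """XOR with a hash-derived pad. Insecure toy stand-in for a cipher; self-inverse."""
    pad = hashlib.sha256(b"pad" + key).digest()[: len(data)]
    return bytes(a ^ b for a, b in zip(data, pad))


def sibling(label):
    return label[:-1] + ("1" if label[-1] == "0" else "0")


def build_tree(height):
    keys = {"": os.urandom(KEY_LEN)}
    for depth in range(1, height + 1):
        for i in range(2 ** depth):
            keys[format(i, "0{}b".format(depth))] = os.urandom(KEY_LEN)
    return keys


def user_keys(keys, leaf):
    return {leaf[:d]: keys[leaf[:d]] for d in range(len(leaf) + 1)}


def revoke(keys, leaf):
    """Send one ciphertext per level: r_{p(v)} encrypted under the sibling key k_{s(v)}."""
    message, r, node = [], os.urandom(KEY_LEN), leaf
    while node:
        parent = node[:-1]
        message.append((sibling(node), parent, toy_encrypt(keys[sibling(node)], r)))
        keys[parent] = prg_left(r)   # new key k'_{p(v)} = L(r_{p(v)})
        r = prg_right(r)             # value for the next level up
        node = parent
    return message


def process(my_keys, message):
    """Decrypt the one value this user can read, then derive every key above it."""
    for enc_label, target, ciphertext in message:
        if enc_label in my_keys:
            r, node = toy_encrypt(my_keys[enc_label], ciphertext), target
            while True:
                my_keys[node] = prg_left(r)
                if not node:
                    return
                r, node = prg_right(r), node[:-1]


if __name__ == "__main__":
    keys = build_tree(3)
    alice = user_keys(keys, "010")
    message = revoke(keys, "000")
    process(alice, message)
    print(len(message), alice[""] == keys[""])
```

Each remaining user holds exactly one of the sibling keys used in the message, so it decrypts a single value and derives the rest locally; the revoked user holds none of them.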

Advantages of the new scheme: This construction halves
the communication overhead of the basic scheme to only $\log n$,
and its security can be rigorously proven. It has an additional
advantage: In the scheme of Wallner et al. the group controller
chooses the group key (the root key), whereas in our construction
this key is the output of a pseudo-random generator. Suppose
that there is an adversary which can break encryptions
performed with a subset of the key space (for example keys
in which certain bits have a linear dependency), and furthermore
that this adversary has gained temporary control over the
group controller (e.g. when the controller was manufactured).
Then if the scheme of [28] is used, the adversary might corrupt
the method by which the group controller generates keys
in such a way that the root key would always be chosen from
the "weak" subspace. However, if our scheme is used, and the
pseudo-random generator $G(x) = L(x)R(x)$ is cryptographically
strong, then it will be hard to find values $r$ such that the
root key $k = L(R(R(\cdots(r)\cdots)))$ is weak.

Fig. 3. Key revocation in the improved scheme (the new root key is $k' = L(R(R(r)))$).

Independently, McGrew and Sherman [21] have presented
a tree based rekeying scheme which has the same overhead
as ours. However, the security of their scheme is based on
non-standard cryptographic assumptions and is not rigorously
proven. In comparison, the security of our scheme can be rigorously
proven based on the widely used assumption of the existence
of pseudo-random generators [5].

Abstract. We give cryptographic schemes that help trace the source
of leaks when sensitive or proprietary data is made available to a large
set of parties. This is particularly important for broadcast and database
access systems, where the data should be accessible only to authorized
users. Such schemes are very relevant in the context of pay television, and
easily combine with and complement the Broadcast Encryption schemes
of [6].


1  Introduction 

If only one person is told about some secret, and this next appears on the evening
news, then the guilty party is evident. A more complex situation arises if the
set of people that have access to the secret is large. The problem of determining
guilt or innocence is (mathematically) insurmountable if all people get the exact
same data and one of them behaves treacherously and reveals the secret.

Any  data  that  is  to  be  available  to  some  while  it  should  not  be  available  to 
others can obviously be protected by encryption. The data supplier may give
authorized parties cryptographic keys allowing them to decrypt the data. This does
not  solve  the  problem  above  because  it  does  not  prevent  one  of  those  authorized 
to  view  the  message  (say,  Alice)  from  transferring  the  cleartext  message  to  some 
unauthorized  party  (say,  Bob).  Once  this  is  done  then  there  is  no  (cryptographic) 
means  to  trace  the  source  of  the  leak.  We  call  all  such  unauthorized  access  to 
data  piracy.  The  traitor  or  traitors  is  the  (set  of)  authorized  user(s)  who  allow 
other,  non-authorized  parties,  to  obtain  the  data.  These  non-authorized  parties 
are  called  pirate  users. 

In many interesting cases piracy is somewhat ineffective if the relevant
cleartext messages must be transmitted by the "traitor" to the "enemy". Typical
cases where this is so include:

-  Pay-per-view  or  subscription  television  broadcasts.  It  is  simply  too  expensive 
and  risky  to  start  a  pirate  broadcast  station. 

-  CD  ROM  distribution  of  data  where  a  surcharge  is  charged  for  different  parts 
of  the  data.  The  cleartext  data  can  only  be  distributed  on  a  similar  storage 
device. 

-  Online  databases,  freely  accessible  (say  on  the  internet)  where  a  charge  may 
be  levied  for  access  to  all  or  certain  records. 


Copyright  (c)  1998,  Springer-Verlag 




In all these cases, transmitting the cleartext from a traitor, Alice, to a pirate
user, Bob, is either irrelevant or rather expensive. As piracy in all these cases is a
criminal commercial enterprise the risk/benefit ratio becomes very unattractive.
These  three  examples  can  be  considered  generic  examples  covering  a  wide  range 
of  data  services  offered. 

Our contribution in this paper may be viewed in the following manner: Consider
a ciphertext that may be decrypted by a large set of parties, but each and
every party is assigned a different personal key used for decrypting the ciphertext.
(We use the term personal key rather than private key to avoid confusion
with public key terminology.) Should the personal key be discovered (by taking
apart a television pirate decoder or by counter-espionage), the traitor will be
identified.

We note that in fact, our schemes have the very desirable property that the
identity of the traitor can be established by considering the pirate decryption
process as a black box. It suffices to capture one pirate decoder and its behavior
will identify the traitor; there is no need to "break it open" or read any data
stored inside. We use the term pirate decoder to represent the pirate decryption
process. This may or may not be a physical box; it may simply be some
code on a computer.

Clearly, a possible solution is to encrypt the data separately under different
personal keys. This means that the total length of the ciphertext is at least n
times the length of the cleartext, where n is the number of authorized parties.
This is clearly impossible in any broadcast environment. This is also very
problematic in the context of CD ROM distributed databases because this means
that every CD ROM must be different. An encrypted online database, freely
accessible as above, must store an individually encrypted copy of the database
for each and every authorized user.

The underlying security assumption of our schemes is either information
theoretic security (where the length of the personal keys grows with the length
of the messages to be transmitted) or it may be based on the security of any
symmetric scheme of your choice. In both cases, security depends on a scheme
parameter k, the largest group of colluding traitors.

In practice, today it is often considered sufficient to prevent piracy by supplying
the authorized parties with so-called secure hardware solutions that are designed
to prevent interference and access to enclosed cryptographic keys (smartcards
and their like). Our schemes do not require any such assumption; they
obtain their claimed security without any secure hardware requirements. Should
such devices be used to store the keys they will undoubtedly make the attack
more expensive, but this is not a requirement.

Fighting piracy in general has the following components:

1. Identify that piracy is going on and prevent the transmittal of information
to pirate users, while harming no legitimate users.

2. Take legal measures against the source of such piracy, and supply legal evidence
of the pirate identity.

Any solution to fighting piracy must be considered in light of the following
performance parameters:

(a)  What are the memory and computation requirements per authorized user?

(b)  What are the memory and computation requirements for the data supplier?

(c)  What is the data redundancy overhead? This is measured in multiples of
the cryptographic security parameter and refers to the communications
overhead (in broadcast or online systems) or the additional "wasted" storage
in CD ROM type systems.

Consider  a  pirate  user  who  has  already  obtained  all  keys  required  to  read  a 
CD ROM in its entirety. Clearly, there is little one can do technically to prevent
her  from  continuing  to  use  the  CD  ROM.  The  situation  is  somewhat  different  if 
the  system  requires  some  action  on  behalf  of  the  data  supplier,  e.g.,  television 
broadcast  or  online  database. 

The broadcast encryption scheme of Fiat and Naor [6] deals with disabling
active pirate users very efficiently. These schemes allow one to broadcast messages
to any dynamic subset of the user set. This is specifically suitable for
pay-per-view TV applications but also implies the piracy protection above. These
schemes require a single short transmission to disable all pirate decoders if they
were manufactured via a collaborative effort of no more than k traitors.

The number of traitors above, k, is a parameter of the broadcast encryption
schemes.  While  this  may  not  be  evident  at  first,  the  same  scheme  could  be  used 
by  any  online  database  supplier  to  kill  off  illegitimate  access  simply  by  telling 
users  who  log  on  what  users  are  currently  blacklisted. 

The goal of this paper is to deal with traitor tracing (item 2 above), i.e., to identify
the source of the problem and to deal with it via legal or extra-legal means. Our
solution, called traitor tracing, is valid for all examples cited above: broadcast,
online, and CD ROM type systems.

We devise k-resilient traceability schemes with the following properties:

1.  Either  the  cleartext  information  itself  is  continuously  transmitted  to  the 
enemy  by  a  traitor,  or 

2.  Any  captured  pirate  decoder  will  correctly  identify  a  traitor  and  will  protect 
the  innocent  even  if  up  to  k  traitors  combine  and  collude. 

It  would  make  sense  to  have  both  broadcast  encryption  and  traitor  tracing 
schemes available, at different security levels. The costs of such schemes are
measured in the memory requirements at the user end and in the total transmission
length  required.  In  practice  one  would  want  a  broadcast  encryption  scheme  with 
a  different  security  level  (measured  in  the  numbers  of  traitors  required  to  disable 
the  scheme).  Fortunately,  both  types  of  scheme,  at  arbitrary  security  levels,  can 
be  trivially  combined  simply  by  XOR’ing  the  results. 

We deal with schemes of the following general form: The data supplier generates
a base set R of r random keys and assigns subsets of these keys to users, m
keys per user (these parameters will be specified later). These m keys jointly form
the user personal key. Note that different personal keys may have a nonempty
intersection. We denote the personal key for user u by P(u); this is a set of keys
over the base set R.

A traitor tracing message consists of many pairs of (enabling block, cipher
block). The cipher block is the symmetric encryption of the actual data (say a few
seconds of a video clip), under some secret random key S. Alternatively, it could
simply be the XOR of the message with S, and we would get an information
theoretically secure version of the scheme. The enabling block allows authorized
users to obtain S. The enabling block consists of encrypted values under some
or all of the r keys at the data supplier. Every user will be able to compute S by
decrypting the values for which he has keys and then computing the actual key
from these values. The computation on the user end, for all schemes we present,
is simply the exclusive or of all values the user has been able to decrypt.

Figures  1  and  2  describe  the  general  nature  of  our  traitor  tracing  schemes. 

Traitors  may  conspire  and  give  an  unauthorized  user  (or  users)  a  subset  of 
their  keys  so  that  the  unauthorized  user  will  also  be  able  to  compute  the  real 
message  key  from  the  values  he  has  been  able  to  decrypt.  The  goal  of  the  system 
designer  is  to  assign  keys  to  the  users  such  that  when  a  pirate  decoder  is  captured 
and  the  keys  it  possesses  are  examined,  it  should  be  possible  to  detect  at  least 
one traitor, subject to the limitation that the number of traitors is at most
k.  (We  cannot  hope  to  detect  all  traitors  as  one  traitor  may  simply  provide  his 
personal  key  and  others  may  provide  nothing). 


Fig. 1. (Diagram: User 1, User i, ..., User n, each holding a personal key.)

Fig. 2.


We remark that in many cases it is preferable to predetermine a fixed number
of users n, and to assign them personal keys, even if the actual number of users
is smaller. Later users who join the system by purchasing a subscription to a
television station, online database, or CD ROM access privilege are assigned
personal keys from those preinstalled. This is especially important in the case of
data distributed on CD ROM.

1.1 An Example

Using the 1-level secret scheme described hereinafter, allocating 5% of a compressed
MPEG II digital video channel to the traitor tracing scheme allows us
to change keys every minute or so (a new enabling block every minute).

The  traitor  tracing  scheme  is  resilient  to  k  =  32  traitors,  with  probability 
and  can  accommodate  up  to  1,000,000,000  authorized  users.  The  total 
number of keys stored by the data supplier (the television broadcaster) is $2^{19}$, and
the personal key of every user consists of $2^{13}$ keys. These parameters are overly
pessimistic because they are derived from the general theorem concerning the
scheme using the Chernoff bound.

In practice, there is no real need to change keys every minute; even changing
keys once every hour will make any pirate broadcaster give up in despair.

2  Definitions 

For messages generated by a data supplier for a set of n users, we define three
elements that jointly constitute a traceability scheme:

- A user initialization scheme, used by the data supplier to add new users.
The data supplier supplies user $u_i$ with her personal key; in our case this
consists of a set $P(u_i)$ containing decryption keys.

- A decryption scheme, used by every user to decrypt messages generated
by the data supplier. In our schemes, the messages are decrypted block by
block, where every block decryption consists of a preliminary decryption of
encrypted keys in the enabling block, combining the results to obtain a
common key, followed by a decryption of the cipher block.

- A traitor tracing algorithm, used upon confiscation of a pirate decoder, to
determine the identity of a traitor. We assume below that the contents of a
pirate decoder can be viewed by the traitor tracing algorithm.

We distinguish between circumstances where the decryption schemes used by
all users are in the public domain, whereas the decryption keys themselves are
kept secret, called open schemes, versus the case where the actual decryption
scheme as well as the keys are kept secret, called secret schemes.

The goal of an adversary is to construct a pirate decoder that allows decryption
and prevents the guilty from being identified. In particular, one way to
ensure that the guilty are safe is to try to incriminate someone else. Clearly, the
adversary's task is no harder with an open scheme compared to a secret scheme.
On the other hand, secret schemes pose additional security requirements at the
data supplier site, and the correctness of the traitor identity may be based on
probabilistic arguments, which may be somewhat less convincing in a court of
law.

We present efficient schemes of both types, and our constructions give better
results for secret schemes. It is clearly advantageous to use secret schemes in
practice, and any real implementation will do so.

To  simplify  the  definitions  below  we  will  assume  that  it  is  impossible  to  guess 
a  secret  key.  The  probability  of  guessing  a  secret  key  is  exponentially  small  in 
the  length  of  the  key,  and  thus  we  will  ignore  this  question  in  the  definitions 
below.  An  alternative  would  be  to  talk  about  probability  differences  rather  than 
absolute  probabilities,  this  is  done  in  [6]  and  analogous  definitions  could  be  used 
here. 

Definition 1. An n user open traceability scheme is called k-resilient if for every
coalition of at most k traitors the following holds: Suppose the coalition uses the
information its members got in the initialization phase to construct a pirate
decoder. If this decoder is capable of applying the decryption scheme, then the
traitor tracing algorithm will correctly identify one of the coalition members.

Definition 2. An n user secret traceability scheme is called (p, k)-resilient if for
all but at most p of the $\binom{n}{k}$ coalitions of k traitors the following holds: Suppose
the coalition uses the information its members got in the initialization phase to
construct a pirate decoder. If this decoder is capable of applying the decryption
scheme, then the traitor tracing algorithm will identify one of the coalition
members with probability at least 1 - p.

3 Construction of Traceability Schemes

In this section we describe three constructions of k-resilient traceability schemes.
All these schemes are based on the use of hash functions, combined with any
private key cryptosystem. (For more information on hash functions and their
applications, see [7, 3, 9, 5].) The basic use of hash functions is to assign decryption
keys to authorized users in a manner which prevents any coalition of
traitors from combining keys taken from the personal keys of its members into
a set of keys that allows decryption yet is “close” to the personal key of any
innocent user.

The first scheme is the simplest one. It is an open scheme, based on “one
level” hash functions. Each hash function maps the n users into a set of $2k^2$ decryption
keys. The keys themselves are kept secret, but the mapping (which user
is mapped to what key) is publicly known. This is a simple scheme, but its performance
can be improved upon: Every user personal key consists of $O(k^2 \log n)$
decryption keys, and the enabling block consists of $O(k^4 \log n)$ encrypted keys.

The second scheme is an open “two level” scheme. Here, a set of first level
hash functions map the n users into a set of size k; each function thereby induces
a partition of the n users into k subsets. Each of these subsets is mapped separately
by “second level” hash functions into $\log^2 k$ decryption keys. This scheme requires
$O(k^2 \log^2 k \log n)$ keys per user, and an enabling block of $O(k^3 \log^4 k \log n)$
encrypted keys.

The  third  scheme  is  a  “one  level”  secret  scheme.  Here,  we  assume  that  the 
hash  functions,  as  well  as  the  decryption  keys,  are  kept  secret.  There  is  a  positive 
probability  p  (0  <  p  <  1)  that  the  adversary  will  be  able  to  produce  pirate 
decoders  which  prevent  the  identification  of  any  traitor. 

However, even if the keys known to the k collaborators enable the construction
of such “wrongly incriminating” pirate decoders, choosing such a set is highly
improbable. Even if this unlikely event occurs, the adversary will not know that
this is the case.

Being a secret scheme implies that the adversary does not know which keys
correspond to any specific user. The personal key consists of $O(k \log(n/p))$
decryption keys, and the enabling block has $O(k^2 \log(n/p))$ encrypted keys.

All schemes are constructed by choosing hash values at random, and using
probabilistic arguments to assert that the desired properties hold with overwhelming
probability. Therefore, these schemes are not constructive. However,
the properties of the simplest scheme can be verified.

3.1  A  Simple  Scheme 

Let k be an upper bound on the number of traitors. Every enabling block consists
of r encryptions, and m denotes the number of keys comprising a user personal
key.

We first deal with the case k = 1. The data supplier generates $r = 2 \log n$ keys
$s_1^0, s_1^1, s_2^0, s_2^1, \ldots, s_{\log n}^0, s_{\log n}^1$. The personal key for user $u$ is the set of $m = \log n$
keys $s_1^{b_1}, s_2^{b_2}, \ldots, s_{\log n}^{b_{\log n}}$, where $b_i$ is the $i$-th bit in the ID of $u$.

To encrypt a secret $s$, the data supplier splits it into $\log n$ secrets $s_1, \ldots, s_{\log n}$,
i.e., the data supplier chooses random $s_1, s_2, \ldots, s_{\log n}$ such that $s$ is the
bitwise XOR of the $s_i$'s (its $j$-th bit equals $s^{(j)} = \bigoplus_{i=1}^{\log n} s_i^{(j)}$). The value $s_i$ is
encrypted using keys $s_i^0$ and $s_i^1$, and both encryptions are added to the enabling
block. Every user $u$ can reconstruct all the $s_i$'s and hence can decrypt $s$. On
the other hand, any pirate decoder must contain a key for every $1 \le i \le \log n$
(otherwise $s_i$ would remain unknown and consequently $s$ could not be obtained).

Thus, given that at most one traitor is involved, the keys stored in the pirate
decoder uniquely identify the traitor.

When dealing with larger coalitions, the idea is to generalize the above scheme.
Instead of one bit per index we will have larger domains (and have a key for every
element in the domain). We will also split s into more than log n parts and have
appropriately more indices or hash functions. The major difficulty we encounter
is in the procedure for detecting traitors. Since, unlike the case k = 1, keys may
be mixed from several members of the coalition, we must make sure that any
two users are not only different on some indices, but are different in almost all
indices. A detailed description of the scheme is given below.

Initialization: A set of $\ell$ “first level” hash functions $h_1, h_2, \ldots, h_\ell$ is chosen
at random by the data supplier. Each hash function $h_i$ maps $\{1, \ldots, n\}$ into an
independent set of $2k^2$ random keys, $S_i = \{s_{i,1}, s_{i,2}, \ldots, s_{i,2k^2}\}$. The personal
key for user $u$ is the set of $\ell$ keys $P(u) = \{h_1(u), h_2(u), \ldots, h_\ell(u)\}$.

Distributing a key: For each $i$ ($i = 1, 2, \ldots, \ell$) the data supplier encrypts a
key $s_i$ under each of the $2k^2$ keys in $S_i$. The final key $s$ is the bitwise XOR of
the $s_i$'s (its $j$-th bit equals $s^{(j)} = \bigoplus_{i=1}^{\ell} s_i^{(j)}$). Each authorized user has one
key from $S_i$, so he can decrypt every $s_i$, and thus compute $s$.

Parameters: The memory required per user is $m = \ell$ keys. An enabling block
to encode a secret value $s$ consists of $r = 2k^2 \ell$ key encryptions.

Fraud: The k traitors can get together and combine their personal keys.
They may choose one key from every set $S_i$ ($i = 1, 2, \ldots, \ell$). These $\ell$ keys are
put together in a pirate decoder. This set of keys, $F$, enables the purchaser of
such a decoder to decrypt every $s_i$, and thus compute $s$.

Detection of Traitors: Upon confiscation of a pirate decoder, the set of keys
in it, $F$, is exposed. The set $F$ must contain at least $\ell$ keys (at least one key
per set $S_i$). Denote the key of $F$ with minimum index in $S_i$ by $f_i \in S_i$. For each $i$,
the users in $h_i^{-1}(f_i)$ are identified and marked. The user with the largest number of
marks is exposed.
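
The initialization, fraud and detection steps above can be simulated as follows. This is a minimal sketch, not the authors' implementation: keys are represented only by their index within $S_i$ (no encryption is performed), the coalition is assumed to pick a random member's key for each $S_i$, and the function names are hypothetical.

```python
import random
from collections import Counter


def init_scheme(n: int, k: int, ell: int, rng: random.Random):
    # h[i][u] is the index (0 .. 2k^2 - 1) of the key in S_i assigned to user u.
    return [[rng.randrange(2 * k * k) for _ in range(n)] for _ in range(ell)]


def pirate_decoder(h, coalition, rng: random.Random):
    # Assumed strategy: for each S_i, insert the key of a randomly chosen traitor.
    return [h_i[rng.choice(coalition)] for h_i in h]


def trace(h, decoder):
    # Mark every user whose key in S_i equals the captured key f_i,
    # then expose the user with the largest number of marks.
    marks = Counter()
    for h_i, f_i in zip(h, decoder):
        for u, key_index in enumerate(h_i):
            if key_index == f_i:
                marks[u] += 1
    return marks.most_common(1)[0][0], marks


rng = random.Random(0)
n, k, ell = 64, 2, 400
h = init_scheme(n, k, ell, rng)
coalition = [5, 17]
exposed, marks = trace(h, pirate_decoder(h, coalition, rng))
```

Because the mapping is public in an open scheme, the tracer can evaluate $h_i^{-1}(f_i)$ directly. By the pigeonhole argument, some traitor always collects at least $\ell/k$ marks, while an innocent user is marked at each index only when the hash values happen to collide.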

Goal:  We  want  to  show  that,  for  all  coalitions  of  size  k,  the  probability  of 
exposing  a  user  who  is  not  a  traitor  is  negligible. 

Clearly, at least one of the traitors contributes at least $\ell/k$ of the keys to
the pirate decoder (we ignore duplicate keys from the same $S_i$). We want to
show that the probability (over all choices of hash functions) that a good user
is marked $\ell/k$ times is negligible. Consider a specific user, say 1, and a specific
coalition $T$ of k traitors (which does not include 1). As hash functions are chosen
at random, the value $a_i = h_i(1)$ is uniformly distributed in $S_i$. The coalition gets
at most k keys in $S_i$. The probability that $a_i$ is among these keys is at most $k/2k^2 = 1/2k$.
Let $X_i$ be a zero-one random variable, where $X_i = 1$ if $a_i \in h_i(T)$. The mean
value of $\sum_{i=1}^{\ell} X_i$ is at most $\ell/2k$. We use the following version of the Chernoff bound (see [2],
Theorem A.12) to bound the probability that $\sum_{i=1}^{\ell} X_i \ge \ell/k$. Let $X_1, \ldots, X_\ell$
be mutually independent random variables, with

$$\Pr(X_i = 1) = p, \qquad \Pr(X_i = 0) = 1 - p.$$

Then, for all $\beta > 1$,

$$\Pr\left[\frac{1}{\ell}\sum_{i=1}^{\ell} X_i \ge \beta p\right] < \left(\frac{e^{\beta-1}}{\beta^{\beta}}\right)^{p\ell}.$$

In our case, substituting $p = 1/2k$ and $\beta = 2$, we have

$$\Pr\left[\frac{1}{\ell}\sum_{i=1}^{\ell} X_i \ge \frac{1}{k}\right] < \left(\frac{e}{4}\right)^{\ell/2k} < 2^{-\ell/4k}.$$


In order to overcome all $\binom{n}{k}$ coalitions and all n choices of users, we choose $\ell$
satisfying

$$n \cdot \binom{n}{k} \cdot 2^{-\ell/4k} < 1,$$

namely $\ell > 4k^2 \log n$. With this parameter, there is a choice of $\ell$ hash functions
such that for every coalition and every authorized user not in the coalition, the
user is not incriminated by the tracing algorithm. We summarize these results
in the next theorem:


Theorem 3. There is an open k-resilient traceability scheme, where a user personal
key consists of $m = 4k^2 \log n$ decryption keys, and an enabling block consists
of $r = 8k^4 \log n$ key encryptions.
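
The simplification used in the proof, replacing the Chernoff expression with $p = 1/2k$ and $\beta = 2$ by the cleaner bound $2^{-\ell/4k}$, can be checked numerically. The sketch below assumes the form of the bound quoted above; the function names are hypothetical.

```python
import math


def chernoff_bound(p: float, beta: float, ell: int) -> float:
    # Bound on Pr[(1/ell) * sum X_i >= beta * p]: (e^(beta-1) / beta^beta)^(p * ell).
    return (math.exp(beta - 1) / beta ** beta) ** (p * ell)


def check_substitution(k: int, ell: int):
    # p = 1/(2k), beta = 2 gives (e/4)^(ell/2k); the text simplifies this to 2^(-ell/4k).
    exact = chernoff_bound(1 / (2 * k), 2, ell)
    simplified = 2.0 ** (-ell / (4 * k))
    return exact, simplified


for k, ell in [(1, 8), (4, 100), (32, 1000)]:
    exact, simplified = check_substitution(k, ell)
    print(k, ell, exact, simplified)
```

Since $\log_2(e/4) \approx -0.557$, the exact expression equals roughly $2^{-0.279\,\ell/k}$, which is always below $2^{-0.25\,\ell/k}$; the simplified bound is therefore slightly loose but valid.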

Explicit Constructions: The discussion above shows the existence of open
k-resilient traceability schemes, and does provide us with a randomized method for
constructing the scheme that works with high probability. It does not, however,
suggest an explicit construction. Note however that a given construction can
be verified quite efficiently. The idea is to examine all the pairs of elements u, v
and check the number of functions $h_i$ such that $h_i(v) = h_i(u)$. If this number is
smaller than $\ell/k^2$ then we can conclude that no coalition $T$ of at most k elements
“covers” more than a $1/k$ fraction of the keys of u and hence cannot incriminate
u.
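
The pairwise check described above can be written directly. This is a sketch under the assumption that the scheme is given as a table `h[i][u]` of key indices; the representation and the function names are illustrative only.

```python
from itertools import combinations


def max_pairwise_agreement(h):
    # h[i][u] is the key index that hash function h_i assigns to user u.
    n = len(h[0])
    return max(sum(1 for h_i in h if h_i[u] == h_i[v])
               for u, v in combinations(range(n), 2))


def is_verified_k_resilient(h, k: int) -> bool:
    # Sufficient condition from the text: every pair of users agrees on
    # fewer than ell / k^2 of the ell hash functions.
    return max_pairwise_agreement(h) * k * k < len(h)


# Four hash functions over three users; users 0 and 1 collide only under h[2].
h = [[0, 1, 2], [0, 1, 2], [0, 0, 1], [1, 2, 0]]
print(max_pairwise_agreement(h), is_verified_k_resilient(h, 1), is_verified_k_resilient(h, 2))
```

The check examines all pairs of users and all $\ell$ functions, so it is polynomial in n and $\ell$, in contrast to enumerating all coalitions of size k.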

By considering pairwise differences, we can phrase the construction problem
as a problem in coding theory (see [8] for more information): construct a code
with n codewords over an alphabet of size $2k^2$ of length $\ell$ such that the distance
between every two codewords is at least $\ell - \ell/k^2$. The goal is to construct such
a code with as small $\ell$ as possible. There are no known explicit constructions
that match the probabilistic bound. For the best known construction see [1]
and references therein. For small k the constructions there yield a scheme with
$m \in O(k^6 \log n)$ and $r \in O(k^8 \log n)$.

3.2 An Open Two Level Scheme

The “two level” traceability scheme described in this subsection is more complicated
than the simple scheme, but it saves about a factor of k in the broadcast
overhead.

Theorem 4. There is an open k-resilient traceability scheme, where a user personal
key consists of $m = 2k^2 \log^2 k \log n$ decryption keys, and an enabling block
consists of $r = 4k^3 \log^4 k \log n$ key encryptions.

Proof.  We  describe  the  system,  step  by  step.  As  in  the  one-level  scheme,  the 
proof  is  existential.  We  do  not  know  however  how  to  verify  efficiently  that  a 
given  scheme  is  “good”. 

Initialization: A set of $\ell$ “first level” hash functions $h_1, h_2, \ldots, h_\ell$, each mapping
$\{1, \ldots, n\}$ to $\{1, \ldots, k\}$, is chosen at random. For each $i$ ($i = 1, 2, \ldots, \ell$)
and each element $a$ in $\{1, \ldots, k\}$, a set of $d$ “second level” hash functions
$g_{i,a,1}, \ldots, g_{i,a,d}$ is chosen at random. Each second level function $g_{i,a,j}$ maps the
users in $h_i^{-1}(a) \subseteq \{1, \ldots, n\}$ into a set of $4 \log^2 k$ random independent keys (the
ranges of different functions are independent).

Each user $u \in \{1, \ldots, n\}$ receives $\ell \cdot d$ keys

$$g_{1,h_1(u),1}(u), \ldots, g_{1,h_1(u),d}(u), \quad \ldots, \quad g_{\ell,h_\ell(u),1}(u), \ldots, g_{\ell,h_\ell(u),d}(u).$$

Distributing a key: The data supplier chooses at random $\ell$ independent keys
$s_1, \ldots, s_\ell$. The final key is $s = s_1 \oplus s_2 \oplus \cdots \oplus s_\ell$ (bitwise XOR).

For each $i$ ($i = 1, 2, \ldots, \ell$), $a$ ($a = 1, \ldots, k$), and $j$ ($j = 1, \ldots, d$), let $s_{i,a,j}$ be
an independent random key, satisfying

$$s_i = s_{i,1,1} \oplus \cdots \oplus s_{i,1,d} = s_{i,2,1} \oplus \cdots \oplus s_{i,2,d} = \cdots = s_{i,k,1} \oplus \cdots \oplus s_{i,k,d}.$$

The key $s_{i,a,j}$, encrypted under each of the $4 \log^2 k$ keys in the range of the
function $g_{i,a,j}$, is added to the enabling block.

User $u$ possesses the $d$ keys $g_{i,h_i(u),1}(u), \ldots, g_{i,h_i(u),d}(u)$ and so he is capable
of decoding $s_{i,h_i(u),1}, \ldots, s_{i,h_i(u),d}$, allowing him to reconstruct $s_i$ and then
compute the final key $s$.

Parameters: The personal key consists of $m = \ell d$ keys. The total number of
key encryptions in an enabling block encoding $s$ is $4k\ell d \log^2 k$.

Fraud: The k traitors can get together and expose their own keys in order
to construct a pirate decoder. By the bit sensitivity of XOR, the box must
be able to decrypt every $s_i$ ($i = 1, 2, \ldots, \ell$). To do this, the decoder must be
able to decrypt a complete row $s_{i,a,1}, \ldots, s_{i,a,d}$ for some $a$, $1 \le a \le k$. So, for
each $i$ ($i = 1, 2, \ldots, \ell$) the traitors choose $a = h_i(u)$ for some $u \in T$, and $d$
keys $g_{i,a,1}(u_1), \ldots, g_{i,a,d}(u_d)$, where $u_1, \ldots, u_d \in T$ and $h_i(u_1) = h_i(u_2) = \cdots =
h_i(u_d) = a$. For every $i$, these $d$ keys are placed in the pirate decoder.

Detection of Traitors: Upon confiscation of a pirate decoder, the set of keys
in it, $F$, is exposed. As argued above, the decoder must contain a block of
$d$ keys of the form $b_{i,a,1} = g_{i,a,1}(u_1), \ldots, b_{i,a,d} = g_{i,a,d}(u_d)$ corresponding to
each $i$ ($i = 1, 2, \ldots, \ell$). (If more than one row is in the decoder, only the one
with minimum $a$ is used by the detection algorithm.) For each $j = 1, \ldots, d$, the
detective identifies the users in $g_{i,a,j}^{-1}(b_{i,a,j})$. Each of these users is called marked.
All users who are marked at least $d/\log k$ times are suspects for $s_i$. The user
who is a suspect for the largest number of $s_i$'s is identified as a traitor.
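
The two-stage detection rule (mark, then threshold, then count suspicions) may be easier to follow in code. The sketch below is illustrative only: each second level function is assumed to be given as a dictionary from user to key index, logarithms are taken base 2, k is assumed to be at least 2, and the names `suspects_for_row` and `identify` are hypothetical.

```python
import math
from collections import Counter


def suspects_for_row(g_row, captured, k: int):
    # g_row[j][u]: key index that g_{i,a,j} assigns to user u (users with h_i(u) = a).
    # captured[j]: key index b_{i,a,j} found in the pirate decoder.
    d = len(g_row)
    marks = Counter()
    for g_j, b_j in zip(g_row, captured):
        for u, key_index in g_j.items():
            if key_index == b_j:
                marks[u] += 1
    threshold = d / math.log2(k)  # marked at least d / log k times
    return {u for u, count in marks.items() if count >= threshold}


def identify(suspect_sets):
    # The user who is a suspect for the largest number of the s_i is named a traitor.
    tally = Counter(u for suspects in suspect_sets for u in suspects)
    return tally.most_common(1)[0][0]


# One row with d = 4 second level functions over users 1, 2, 3 and k = 4.
g_row = [{1: 0, 2: 1, 3: 0}, {1: 1, 2: 1, 3: 2}, {1: 3, 2: 0, 3: 0}, {1: 2, 2: 2, 3: 1}]
print(suspects_for_row(g_row, [0, 1, 3, 2], k=4))
```

In this toy row, user 1 is marked four times, user 2 twice and user 3 once; with the threshold $d/\log k = 2$, users 1 and 2 become suspects for this $s_i$.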

Goal:  We  want  to  show  that  there  is  a  choice  of  hash  functions  such  that  for 
all  coalitions,  a  good  user  is  never  identified  as  a  traitor. 

Consider a specific user, say 1, and a specific coalition $T$ of k traitors (which
does not include 1). We first bound the probability that user 1 will be a suspect
for $s_i$. The first level hash function $h_i$ partitions the users into k subsets
$\{h_i^{-1}(1), \ldots, h_i^{-1}(k)\}$. The expected maximum number of traitors in these k
subsets is $\log k / \log \log k$. The probability that user 1 is hashed to a subset together
with more than $\log k$ traitors is at most $1/16k$ [2]. Denote $h_i(1) = a$. Consider
the conditional probability space where $T \cap h_i^{-1}(a)$ contains indeed at most $\log k$
traitors. In this conditional space, the $d$ keys $b_{i,a,1}, \ldots, b_{i,a,d}$ in the pirate decoder
come from the personal keys of $T \cap h_i^{-1}(a)$. As this set contains fewer than
$\log k$ members, there must be at least one member in $T \cap h_i^{-1}(a)$ who is marked
at least $d/\log k$ times. Therefore at least one member of $T$ is a suspect for $s_i$.

Returning to our innocent user 1, the detective marks user 1 with respect to
$g_{i,a,j}$ if there is some $u \in T \cap h_i^{-1}(a)$ such that $g_{i,a,j}(1) = g_{i,a,j}(u)$. The range of
$g_{i,a,j}$ contains $4 \log^2 k$ keys. At most $\log k$ of these are in $g_{i,a,j}(T \cap h_i^{-1}(a))$. So
the probability that user 1 is marked with respect to $g_{i,a,j}$ is at most $1/(4 \log k)$.
The expected number of times user 1 will be marked, with respect to the $d$
functions $g_{i,a,1}, \ldots, g_{i,a,d}$, is at most $d/(4 \log k)$. We use the Chernoff bound to estimate
the probability that user 1 is a suspect for $s_i$.

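Set $X_j = 1$ if user 1 is marked with respect to $g_{i,a,j}$ and $X_j = 0$ otherwise.
Then $\Pr(X_j = 1) \le 1/(4 \log k)$. By the version of the Chernoff bound mentioned
above, with $p = 1/(4 \log k)$ and $\beta = 4$,

$$\Pr\left[\frac{1}{d}\sum_{j=1}^{d} X_j \ge \frac{1}{\log k}\right] < \left(\frac{e^{3}}{4^{4}}\right)^{d/(4 \log k)} < 2^{-3d/(4 \log k)}.$$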


Setting $d = 2 \log^2 k$, the conditional probability that user 1 is a suspect for $s_i$
is at most $2^{-3 \log k / 2} < 1/16k$. The probability of the condition not happening
is at most $1/16k$. So overall, the total (unconditional) probability that user 1 is
a suspect for $s_i$ is at most $1/8k$.

For $i = 1, \ldots, \ell$ let $Y_i = 1$ if user 1 is a suspect for $s_i$ and $Y_i = 0$ otherwise.
Then

$$\Pr\left[\frac{1}{\ell}\sum_{i=1}^{\ell} Y_i \ge \frac{1}{k}\right] < 2^{-\ell/k}.$$

So with probability at least $1 - 2^{-\ell/k}$, user 1 is a suspect for fewer than $\ell/k$ of
the $s_i$.

For every $s_i$ ($i = 1, \ldots, \ell$), at least one member of $T$ is a suspect for $s_i$. $T$
contains k traitors, and so there must be one or more traitors who are suspects for
at least $\ell/k$ of the $s_i$'s. Therefore the probability that user 1 is mistakenly identified as
a traitor is smaller than $2^{-\ell/k}$. The probability that for one of the $\binom{n}{k}$ possible
coalitions $T$ of size k, some good user is mistakenly identified, is smaller than
$n \binom{n}{k} 2^{-\ell/k}$. Setting $\ell = k^2 \log n$, this probability is smaller than 1. This means
that there exists a choice of hash functions $h_i$ and $g_{i,a,j}$ such that a good user is
never mistakenly identified as a traitor. The resulting open k-resilient traceability scheme
has parameters $m = \ell d = 2k^2 \log^2 k \log n$ and $r = 2k\ell d \log^2 k = 4k^3 \log^4 k \log n$.
3.3  A  Secret  One  Level  Scheme 

We simplify the construction and improve its costs by using a secret scheme.
The proposed scheme is one level, and the hash values of users are kept secret.
The major source of saving is that it suffices to map the n users into a set of $4k$
keys (rather than $k^2$ keys as in the simple one level scheme). A coalition of size
k will contain the key of any specific user with constant probability. However, as
the traitors do not know which key this is, any key they choose to insert into the
pirate decoder will miss the key of the authorized user (with high probability).

Initialization: Each user $u$ ($u \in \{1, 2, \ldots, n\}$) is assigned a random name $n_u$
from a universe $U$ of size exponential in n. These names are kept secret. A set of
$\ell$ hash functions $h_1, h_2, \ldots, h_\ell$ are chosen independently at random. Each hash
function $h_i$ maps $U$ into a set of $4k$ random keys $S_i = \{s_{i,1}, s_{i,2}, \ldots, s_{i,4k}\}$. The
hash functions are kept secret as well. User $u$ receives, upon initialization, $\ell$ keys
$\{h_1(n_u), h_2(n_u), \ldots, h_\ell(n_u)\}$.

Distributing a Key: For each $i$ ($i = 1, 2, \ldots, \ell$) the data supplier encrypts a
key $s_i$ under each of the $4k$ keys in $S_i$. The final key $s$ is the bitwise XOR of the
$s_i$'s (the $j$-th bit is $s^{(j)} = \bigoplus_{i=1}^{\ell} s_i^{(j)}$). Each authorized user has one key from
$S_i$, so he can decrypt every $s_i$, and thus compute $s$.

Parameters: The memory required per user is $m = \ell$ keys. The total number
of broadcasts, used in distributing the key $s$, is $r = 4k\ell$.

Fraud: The k traitors can get together and expose their own keys. Given
these keys, they choose one key per set $S_i$ ($i = 1, 2, \ldots, \ell$). These $\ell$ keys are put
together in a pirate decoder. This set of keys, $F$, enables the purchaser of such a
decoder to decrypt every $s_i$, and thus compute $s$.

Detection of Traitors: Upon confiscation of a pirate decoder, the set of keys
in it, $F$, is exposed. $F$ contains $\ell$ keys, one per set $S_i$. Denote these keys by
$f_i \in S_i$. For each $i$, the users in $h_i^{-1}(f_i)$ are identified and marked. The user
with the largest number of marks is exposed.

Goal: We want to show that for all (almost all) coalitions, the probability of
exposing a user who is not a traitor is negligible.

Clearly, at least one of the traitors contributes at least $\ell/k$ of the keys to the
pirate decoder. We want to show that the probability that a good user is marked
$\ell/k$ times is negligible. Consider a specific user, say 1, and a specific coalition
$T$ of k traitors (which does not include 1). As the name assigned to user 1 is
random and the hash functions are random, the value $a_i = h_i(n_1)$ is uniformly
distributed in $S_i$, even given the k values hashed by $h_i$ from the names of the
coalition members. The probability that the value $f_i$ chosen by the coalition for
the pirate decoder equals $a_i$ is therefore $q = 1/4k$. Let $X_i$ be a zero-one random
variable, where $X_i = 1$ if $a_i = f_i$. The mean value of $\sum_{i=1}^{\ell} X_i$ is $\ell/4k$. By the
Chernoff bound

$$\Pr\left[\frac{1}{\ell}\sum_{i=1}^{\ell} X_i \ge \frac{1}{k}\right] < 2^{-3\ell/4k}.$$

In order to overcome all but p of the $\binom{n}{k}$ coalitions and all n choices of users,
we choose $\ell$ satisfying $n \cdot 2^{-3\ell/4k} < p$. That is, $4k \log(n/p)/3 < \ell$, which gives

Theorem 5. There is a (p, k)-resilient secret traceability scheme, where a user
personal key consists of $m = 4k \log(n/p)/3$ decryption keys, and an enabling
block consists of $r = 16k^2 \log(n/p)/3$ key encryptions.
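
To get a feel for the three theorems, the stated values of m and r can be evaluated side by side. This is a rough sketch: logarithms are assumed to be base 2, rounding to integers is ignored, and the function names are hypothetical.

```python
import math


def simple_open(n: int, k: int):
    # Theorem 3: m = 4 k^2 log n, r = 8 k^4 log n (r = 2 k^2 * m).
    m = 4 * k**2 * math.log2(n)
    return m, 2 * k**2 * m


def two_level_open(n: int, k: int):
    # Theorem 4: m = 2 k^2 log^2 k log n, r = 4 k^3 log^4 k log n.
    lk = math.log2(k)
    return 2 * k**2 * lk**2 * math.log2(n), 4 * k**3 * lk**4 * math.log2(n)


def secret_one_level(n: int, k: int, p: float):
    # Theorem 5: m = 4 k log(n/p) / 3, r = 4 k * m.
    m = 4 * k * math.log2(n / p) / 3
    return m, 4 * k * m


n, k, p = 2**20, 16, 2**-10
for name, (m, r) in [("simple open", simple_open(n, k)),
                     ("two level open", two_level_open(n, k)),
                     ("secret one level", secret_one_level(n, k, p))]:
    print(f"{name}: m = {m:.0f}, r = {r:.0f}")
```

For these sample values the secret scheme is far cheaper than either open scheme. The two level scheme only beats the simple one in r when $\log^4 k < 2k$, which under the base-2 assumption requires quite large k; for k = 16 the simple scheme has the smaller enabling block.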

4  Lower  Bounds 

In this section we derive lower bounds for the case where incrimination has
to be absolute, i.e. with no error probability. We assume that the keys the data
supplier distributes to the users are unforgeable. This is not accurate, since there
is always the small chance that the adversary guesses the keys of the user it wants
to incriminate. However, we distinguish between the probability of guessing the
keys (which is exponentially small in the length of the key) and the probability
of incrimination for other reasons, which we would like to be zero. Our view of
the system is therefore as follows: let the set of keys used be $S = \{s_1, s_2, \ldots, s_r\}$
and let each user $i$ obtain a subset $U_i \subseteq S$ of size m.

Claim 1. If no coalition of k users $i_1, i_2, \ldots, i_k$ should be able to incriminate a
user $i_0 \notin \{i_1, i_2, \ldots, i_k\}$, then for all such $i_0, i_1, i_2, \ldots, i_k$ we should have that

$$U_{i_0} \not\subseteq \bigcup_{j=1}^{k} U_{i_j}.$$

Proof. Suppose not, i.e. there exist $i_0, i_1, \ldots, i_k$ such that $U_{i_0} \subseteq \bigcup_{j=1}^{k} U_{i_j}$; then
the coalition of $i_1, i_2, \ldots, i_k$ can reconstruct the keys of $U_{i_0}$ and put them in the
pirate decoder for sale. Anyone examining the contents of the box will have to
deduce that $i_0$ is the traitor that generated it.
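
For small instances, the condition of Claim 1 can be checked by brute force. The sketch below assumes each personal key is given as a Python set of key identifiers; the function name is hypothetical, and the search over all coalitions is exponential in k, so it is only meant as an illustration.

```python
from itertools import combinations


def violates_claim(key_sets, k: int):
    # Return (i0, coalition) if the key set of some user i0 is contained in the
    # union of the key sets of k other users; return None if no such case exists.
    users = range(len(key_sets))
    for i0 in users:
        others = [u for u in users if u != i0]
        for coalition in combinations(others, k):
            union = set().union(*(key_sets[u] for u in coalition))
            if key_sets[i0] <= union:
                return i0, coalition
    return None


key_sets = [{1, 2}, {2, 3}, {1, 3}]
print(violates_claim(key_sets, 1))  # no single user covers another
print(violates_claim(key_sets, 2))  # users 1 and 2 together cover user 0
```

In the example, any two users jointly hold all three keys, so a coalition of size 2 can rebuild the third user's personal key and frame that user.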
Luckily, the issue of set systems obeying the conditions of Claim 1 has been
investigated by Erdős, Frankl and Füredi [4]. From Theorem 3.3 and Proposition
3.4  there  we  can  deduce  that  r  is  f?(min KfcVsPfQgn})  and  from  Proposition 
2.1  there  we  get  that  w  >  Hence  we  have: 

Theorem 6. In any open k-resilient traceability scheme distributing every one
of  the  n  user  in  keys  out  of  r  we  have  that  r  is  ■})  and  m  > 

iJpgn  g  g 

*  log  r  • 

Note that the lower bounds on both r and m are roughly a factor of k smaller
than the best construction we have for an open traceability system.

Abstract

We  construct  a  public  key  encryption  scheme  in  which  there  is  one  public  encryption  key,  but 
many  private  decryption  keys.  If  some  digital  content  (e.g.,  a  music  clip)  is  encrypted  using  the 
public  key  and  distributed  through  a  broadcast  channel,  then  each  legitimate  user  can  decrypt 
using  its  own  private  key.  Furthermore,  if  a  coalition  of  users  collude  to  create  a  new  decryption 
key  then  there  is  an  efficient  algorithm  to  trace  the  new  key  to  its  creators.  Hence,  our  system 
provides  a  simple  and  efficient  solution  to  the  “traitor  tracing  problem”.  A  minor  modification  to 
the  scheme  enables  it  to  resist  an  adaptive  chosen  ciphertext  attack.  Our  techniques  apply  error 
correcting  codes  to  the  discrete  log  representation  problem. 


1  Introduction 

Consider  the  distribution  of  digital  content  to  subscribers  over  a  broadcast  channel.  Typically,  the 
distributor  gives  each  authorized  subscriber  a  hardware  or  software  decoder  (“box”)  containing  a 
secret  decryption  key.  The  distributor  then  broadcasts  an  encrypted  version  of  the  digital  content. 
Authorized  subscribers  are  able  to  decrypt  and  make  use  of  the  content.  This  scenario  comes  up  in 
the  context  of  pay-per-view  television,  and  more  commonly  in  web  based  electronic  commerce  (e.g. 
broadcast  of  online  stock  quotes  or  broadcast  of  proprietary  market  analysis). 

However,  nothing  prevents  a  legitimate  subscriber  from  giving  a  copy  of  her  decryption  software 
to  someone  else.  Worse,  she  might  try  to  expose  the  secret  key  buried  in  her  decryption  box  and  make 
copies  of  the  key  freely  available.  The  “traitor”  would  thus  make  all  of  the  distributor’s  broadcasts 
freely  available  to  non-subscribers.  Chor,  Fiat  and  Naor  [5]  introduced  the  concept  of  a  traitor  tracing 
scheme  to  discourage  subscribers  from  giving  away  their  keys.  Their  approach  is  to  give  each  subscriber 
a  distinct  set  of  keys  that  both  identify  the  subscriber  and  enable  her  to  decrypt.  In  a  sense,  each  set 
of  keys  is  a  “watermark”  that  traces  back  to  the  owner  of  a  particular  decryption  box.  A  coalition 
of  traitors  might  try  to  mix  keys  from  many  boxes,  to  create  a  new  pirate  box  that  can  still  decrypt 
but cannot be traced back to them. A traitor tracing scheme is “k-collusion resistant” if at least one
traitor can always be identified when k of them try to cheat in this way. In practice, especially with
tamper-resistant decryption boxes, it may suffice for k to be a fairly small integer, e.g., on the order
of  20. 

In  this  paper  we  present  an  efficient  public  key  traitor  tracing  scheme.  The  public  key  settings 
enable  anyone  to  broadcast  encrypted  information  to  the  group  of  legitimate  receivers.  Previous 
solutions  were  combinatorial  with  probabilistic  tracing  [5,  11,  15,  16,  17],  and  could  be  either  public- 
key  or  symmetric-key.  Our  approach  is  algebraic  with  deterministic  tracing,  and  is  inherently  public- 
key.  Our  approach  is  much  more  efficient  than  the  public-key  instantiations  of  previous  combinatorial 
constructions.  One  earlier  construction  was  algebraic  in  nature  [9],  although  it  was  later  determined 
to  be  insecure  [18]. 

Previous approaches [5, 11] incur an overhead that is proportional to the logarithm of the size of
the population of honest users. In a commercial setting such as a web broadcast or pay-per-view TV,
where the number of subscribers might be in the millions, this is a significant factor. Our approach
eliminates this factor. Furthermore, secret keys in our scheme are very short. Each private key is just
the discrete log of a single element of a finite field (e.g., as small as 160 bits in practice). The size of
an encrypted message is just 2k + 1 elements of the finite field. The work required to encrypt is about
2k + 1 exponentiations. Decryption takes far less than 2k + 1 exponentiations. During decryption,
only  the  final  exponentiation  uses  the  private  key,  which  can  be  helpful  when  the  secret  is  stored  on 
a  weak  computational  device. 

Previous  probabilistic  tracing  methods  try  to  maximize  the  chance  of  catching  just  one  of  the 
traitors  while  minimizing  the  chance  of  accusing  an  innocent  user.  Our  tracing  method  is  deterministic. 
Innocent  users  are  never  accused,  as  long  as  the  number  of  colluders  is  at  or  below  the  collusion 
threshold,  and  as  long  as  the  complexity  assumption  remains  true.  Even  when  more  than  k  (but  less 
than  2k)  traitors  collude,  some  information  about  the  traitors  can  be  recovered. 

The  intuition  behind  our  system  is  as  follows.  Each  private  key  is  a  different  solution  vector  for 
the  discrete  log  representation  problem  with  respect  to  a  fixed  base  of  field  elements.  We  can  show 
that  the  pirate  is  limited  to  forming  new  keys  by  taking  convex  combinations  of  stolen  keys.  If  every 
set of 2k keys is linearly independent, then every convex combination of k keys can be traced uniquely
(but  not  necessarily  efficiently).  By  deriving  our  keys  from  a  Reed-Solomon  code  in  the  appropriate 
way,  we  can  take  advantage  of  efficient  error  correction  methods  to  trace  uniquely  and  efficiently.  We 
note  that  the  multi-dimensional  discrete  log  representation  problem  has  been  previously  used,  e.g.,  for 
incremental  hashing  [1]  and  Signets  [7]. 

Our  scheme  is  traceable  if  the  discrete  log  problem  is  hard.  The  encryption  scheme  is  secure 
(semantic  security  against  a  passive  adversary)  if  the  decision  Diffie-Hellman  problem  is  hard.  A  small 
modification  yields  security  against  an  adaptive  chosen  ciphertext  attack  under  the  same  hardness 
assumption.  That  level  of  protection  can  be  important  in  distribution  scenarios  where  the  decryption 
boxes  (or  decryption  software)  are  widely  deployed  and  largely  unsupervised. 

In Section 2 we give definitions for the traitor tracing problem. Our basic scheme and a
non-black-box tracing algorithm are described in Section 3. Black box tracing against an arbitrary pirate is
detailed in Section 4. Black box tracing against a restrictive “single-key pirate” is given in Section 5.
Chosen  ciphertext  security  is  considered  in  Section  6,  and  an  open  problem  is  discussed  in  Section  7. 
Conclusions  are  given  in  Section  8,  including  an  application  of  our  scheme  to  defending  against  software 
piracy. 

2  Definitions 

For  a  detailed  presentation  of  the  traitor  tracing  model,  see  [5].  A  public  key  traitor  tracing  encryption 
scheme  is  a  public  key  encryption  system  in  which  there  is  a  unique  encryption  key  and  multiple 
decryption  keys.  The  scheme  is  made  up  of  four  components: 

Key Generation: The key generation algorithm takes as input a security parameter s, a number $\ell$
of private keys to generate, and a number k which we call the collusion bound. It outputs a
public encryption key e and a list of distinct private decryption keys $d_1, \ldots, d_\ell$. Any decryption
key  can  be  used  to  decrypt  a  ciphertext  created  using  the  encryption  key. 

Encryption:  The  encryption  algorithm  takes  a  public  encryption  key  e  and  a  message  M  and  outputs 
a ciphertext C.

Decryption: The decryption algorithm takes a ciphertext C and any of the decryption keys $d_i$ and
outputs  the  message  M.  This  is  an  “open”  scheme  in  the  sense  that  only  the  short  decryption 
keys  are  secret  while  the  decryption  method  can  be  public. 

The pirate: The pirate is a party (an algorithm) that is given the public key and k random decryption
keys $d_1, \ldots, d_k$ from the set of all $\ell$ keys. Using the public key and the k private keys, the
pirate creates a pirate decryption box (or decryption software) $\mathcal{D}$. The pirate box $\mathcal{D}$ must
correctly  decrypt  all  but  a  negligible  fraction  of  the  valid  ciphertexts  generated  by  the  encryption 
algorithm. 

Tracing:  The  encryption  scheme  is  said  to  be  “k-resilient”  if  there  is  a  tracing  algorithm  that,  given 
$\mathcal{D}$ created by some pirate, and all random bits used during key generation, determines at least
one of the $d_i$’s in the pirate’s possession used to create $\mathcal{D}$. The tracing algorithm is said to be
“black box” if its only use of $\mathcal{D}$ is as an oracle to query on various inputs.

Clearly  black  box  tracing  is  preferable  to  tracing  algorithms  that  rely  on  information  embedded 
in the implementation of $\mathcal{D}$. After all, extracting information from the implementation of $\mathcal{D}$ might
be difficult: the software executable might be obfuscated, or $\mathcal{D}$ might be implemented in a tamper
resistant  device.  Black  box  tracing  is  also  important  for  proving  that  we  do  not  assume  the  pirate 
must  embed  specific  information  in  the  pirate  decoder. 

We  describe  black  box  tracing  in  more  detail.  One  notion  of  black  box  tracing  is  that  the  tracer 
uses  the  pirate  decoder  as  a  decryption  oracle.  Given  a  string  C  the  pirate  decoder  outputs  either 
(1)  “invalid”  meaning  C  is  not  a  valid  ciphertext,  or  (2)  outputs  a  plaintext  M  which  it  claims  is 
the  decryption  of  C.  In  this  model  the  tracer  sees  the  full  output  of  the  pirate  decoder,  namely  the 
decryption  of  C.  In  some  settings  such  full  access  to  the  pirate  decoder  may  not  be  possible.  For 
example,  consider  a  pirate  decoder  that’s  intended  to  play  music.  The  decoder  is  given  an  encrypted 
music  clip  and  either  plays  the  music,  or  says  that  the  given  music  clip  is  invalid.  When  the  tracer 
queries  the  pirate  decoder  its  only  feedback  is  whether  the  decoder  plays  the  music  or  outputs  an 
error.  In  other  words,  all  the  tracer  learns  is  whether  the  decoder  was  successful  in  decrypting  the 
given  ciphertext  or  not.  The  tracer  does  not  see  the  full  output  of  the  pirate  decryption  algorithm. 
Hence,  we  can  define  two  types  of  black  box  tracing: 

Full  access  black  box  tracing:  In  this  model  the  tracer  issues  a  query  C  to  the  pirate  decoder. 
The  decoder  returns  either  (1)  a  plaintext  M  which  is  supposedly  the  decryption  of  C,  or  (2)  it 
returns  invalid  meaning  the  ciphertext  C  is  invalid.  The  decoder  can  behave  maliciously  and 
return  either  one  of  the  above  for  any  query.  When  the  decoder  returns  a  message  M  it  is  free 
to  choose  the  message  maliciously.  However,  if  C  is  a  well  formed  ciphertext  the  decoder  must 
return  the  decryption  of  C. 

Minimal  access  black  box  tracing:  In  this  model  the  queries  to  the  pirate  decoder  are  a  pair 
(C,  M)  where  C  is  some  ciphertext  query  and  M  is  some  plaintext.  The  pirate  decoder  returns 
either  (1)  valid  meaning  C  is  a  valid  encryption  of  M,  or  (2)  invalid  meaning  C  is  not.  The 
decoder  can  behave  maliciously  and  return  either  one  of  the  above  for  any  query.  However,  if  C 
is  a  well  formed  ciphertext  which  is  an  encryption  of  M  the  pirate  decoder  must  return  valid. 

In  Section  4  and  Section  5  we  give  two  black  box  tracing  algorithms.  The  first  only  requires  minimal 
black  box  access  to  the  decoder.  The  second  requires  full  black  box  access. 

Our definition of minimal access fits well with the music player analogy. Consider a music player
that takes an encrypted music clip and either plays the music or says that the clip is invalid. The
music player must play all well formed music clips. A well formed music clip is formatted as $(C, E_K(S))$
where E is a semantically secure encryption scheme, K is a symmetric key used to encrypt the music
S, and C is a proper encryption of K using the traitor tracing system. Suppose the traitor tracing
scheme supports minimal access black box tracing. We trace a pirate music player as follows: when the
tracer issues a query (C, K) we pick a random piece of music S and create the music clip $(C, E_K(S))$.
If  the  player  plays  the  music  clip  we  respond  to  the  tracer  query  with  valid.  Otherwise,  we  respond 
with  invalid.  The  tracing  algorithm  will  correctly  identify  one  of  the  pirate  keys. 

Throughout  the  paper  we  assume  the  pirate  boxes  are  stateless.  That  is,  the  pirate  decoder  does 
not  respond  to  decryption  requests  based  on  its  responses  to  previous  requests.  This  does  not  seem 
to  be  a  problem  since  the  scheduling  of  tracing  queries  can  be  randomized,  and  furthermore,  tracer 
queries  can  be  interspersed  within  regular  decryption  requests. 

We  point  out  that  our  definition  of  the  pirate  says  that  the  pirate  obtains  a  random  subset  of  size 
k of the $\ell$ private keys. The pirate does not choose which private keys he receives. This is a natural
definition since we are assuming that keys are assigned to users in a random fashion: when a new user
buys a decryption box the user receives a box containing a randomly chosen unassigned private key.
Hence, even if the pirate breaks into boxes belonging to k users of his choice, the pirate still obtains
a  set  of  k  random  keys. 

Representations: Our traitor tracing scheme relies on the representation problem. When $y = \prod_{i=1}^{2k} h_i^{\delta_i}$
we say that $(\delta_1, \ldots, \delta_{2k})$ is a “representation” of y with respect to the base $h_1, \ldots, h_{2k}$. If
$d_1, \ldots, d_m$ are representations of y with respect to the same base, then so is any “convex combination”
of the representations: $d = \sum_{i=1}^{m} \alpha_i d_i$ where $\alpha_1, \ldots, \alpha_m$ are scalars such that $\sum_{i=1}^{m} \alpha_i = 1$.
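
The closure of representations under convex combinations can be checked numerically. The following is a minimal sketch, not part of the scheme itself: the toy group (the order-1019 subgroup of the integers modulo 2039), the base of four elements, and the helper names `evaluate`, `convex_combination` and `demo` are all assumptions made for illustration.

```python
import random

# Toy parameters, assumed for illustration only: the order-Q subgroup of Z_P^*.
P, Q, G = 2039, 1019, 4   # P = 2Q + 1 is prime, and G = 4 has order Q


def evaluate(base, delta):
    """Return prod_i base[i]^delta[i] mod P."""
    y = 1
    for h, d in zip(base, delta):
        y = y * pow(h, d % Q, P) % P
    return y


def convex_combination(reps, alphas):
    """Return sum_i alphas[i] * reps[i] mod Q; the alphas must sum to 1 mod Q."""
    assert sum(alphas) % Q == 1
    width = len(reps[0])
    return [sum(a * d[j] for a, d in zip(alphas, reps)) % Q for j in range(width)]


def demo(seed=0):
    rng = random.Random(seed)
    r = [rng.randrange(1, Q) for _ in range(4)]      # discrete logs of the base
    base = [pow(G, ri, P) for ri in r]
    d1 = [rng.randrange(Q) for _ in range(4)]
    y = evaluate(base, d1)
    # Shifting d1 by a vector orthogonal to r gives a second representation of y.
    shift = [r[1], -r[0] % Q, 0, 0]
    d2 = [(a + b) % Q for a, b in zip(d1, shift)]
    a1 = rng.randrange(Q)
    d = convex_combination([d1, d2], [a1, (1 - a1) % Q])
    return evaluate(base, d2) == y and evaluate(base, d) == y
```

The check works because $y = g^{d \cdot r}$ for any representation d, and a combination whose scalars sum to 1 leaves the inner product $d \cdot r$ unchanged modulo q.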

Notation:  We  let  u  •  v  denote  the  inner  product  of  the  two  vectors  u  and  v. 

3  The  encryption  scheme 

We are now ready to present our traitor tracing encryption scheme. Let s be a security parameter
and k be the maximal coalition size. Our scheme defends against any collusion of at most k parties.
We wish to generate one public key and $\ell$ corresponding private keys. Without loss of generality we
assume $\ell \geq 2k + 2$ (if $\ell < 2k + 2$ we set $\ell = 2k + 2$ and generate $\ell$ private keys).

Our scheme makes use of a certain linear space tracing code $\Gamma$ which is a collection of $\ell$ codewords
in $\mathbb{Z}_q^{2k}$. The construction of the set $\Gamma$ and the properties it has to satisfy are described in the next
section. For now, it suffices to view the $\ell$ words in $\Gamma$ as vectors of integers of length 2k. The set
$\Gamma = \{\gamma^{(1)}, \ldots, \gamma^{(\ell)}\}$ is fixed and publicly known.

Let $G_q$ be a group of prime order q. The security of our encryption scheme relies on the difficulty
of computing discrete log in $G_q$. More precisely, the security is based on the difficulty of the Decision
Diffie-Hellman problem [3] in $G_q$ as discussed below. One can take as $G_q$ the subgroup of $\mathbb{Z}_p^*$ of order
q where p is a prime with $q \mid p - 1$. Alternatively, one can use the group of points of an elliptic curve
over  a  finite  field. 

Key  generation:  Perform  the  following  steps: 

1. Let $g \in G_q$ be a generator of $G_q$.

2. For $i = 1, \ldots, 2k$ choose a random $r_i \in \mathbb{F}_q$ and compute $h_i = g^{r_i}$.

3. The public key is $(y, h_1, \ldots, h_{2k})$, where $y = \prod_{i=1}^{2k} h_i^{\alpha_i}$ for random $\alpha_1, \ldots, \alpha_{2k} \in \mathbb{F}_q$.

4. A private key is an element $\theta_i \in \mathbb{F}_q$ such that $\theta_i \cdot \gamma^{(i)}$ is a representation of y with respect to the
base $h_1, \ldots, h_{2k}$. The i’th key, $\theta_i$, is derived from the i’th codeword $\gamma^{(i)} = (\gamma_1, \ldots, \gamma_{2k}) \in \Gamma$ by

$$\theta_i = \Big(\sum_{j=1}^{2k} r_j \alpha_j\Big) \Big/ \Big(\sum_{j=1}^{2k} r_j \gamma_j\Big) \pmod{q} \qquad (1)$$

To simplify the exposition we frequently refer to the private key as being the representation
$d_i = \theta_i \cdot \gamma^{(i)}$. Note however that only $\theta_i$ needs to be kept secret since the code $\Gamma$ is public. One
can verify that $d_i$ is indeed a representation of y with respect to the base $h_1, \ldots, h_{2k}$.

Encryption: To encrypt a message M in $G_q$ do the following: first pick a random element $a \in \mathbb{F}_q$.
Set the ciphertext C to be

$$C = \left(M \cdot y^a,\ h_1^a,\ \ldots,\ h_{2k}^a\right)$$

Decryption: To decrypt a ciphertext $C = (S, H_1, \ldots, H_{2k})$ using user i’s secret key, $\theta_i$, compute

$$M = S / U^{\theta_i} \quad \text{where} \quad U = \prod_{j=1}^{2k} H_j^{\gamma_j}$$

Here $\gamma^{(i)} = (\gamma_1, \ldots, \gamma_{2k}) \in \Gamma$ is the codeword from which $\theta_i$ is derived. The cost of computing U
is far less than 2k + 1 exponentiations thanks to simultaneous multiple exponentiation [10, p. 618].
Also note that U can be computed without knowledge of the private key, leaving only a single
exponentiation  by  the  private  key  holder  to  complete  the  decryption. 

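Before going any further we briefly show that the encryption scheme is sound, i.e. any private
key $d_i$ correctly decrypts any ciphertext. Given a ciphertext $C = (M \cdot y^a, h_1^a, \ldots, h_{2k}^a)$, decryption will
yield $M \cdot y^a / U^{\theta_i}$ where $U = \prod_{j=1}^{2k} (h_j^a)^{\gamma_j}$. Then

$$U^{\theta_i} = \Big(\prod_{j=1}^{2k} h_j^{a \gamma_j}\Big)^{\theta_i} = \Big(g^{\sum_{j=1}^{2k} r_j \gamma_j}\Big)^{\theta_i a} = \Big(g^{\sum_{j=1}^{2k} r_j \alpha_j}\Big)^{a} = \Big(\prod_{j=1}^{2k} h_j^{\alpha_j}\Big)^{a} = y^a$$

as needed. The third equality follows from Equation (1). More generally, it is possible to decrypt
given any representation $(\delta_1, \ldots, \delta_{2k})$ of y with respect to the base $h_1, \ldots, h_{2k}$, since $\prod_{j=1}^{2k} (h_j^a)^{\delta_j} = y^a$.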

Tracing  algorithm:  We  describe  our  tracing  algorithm  in  Section  3.3. 
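
The key generation, encryption and decryption steps above can be sketched in a few lines. This is an illustrative toy only: the group parameters are far too small for any security, the function names are assumptions, and the codewords passed in by the caller are arbitrary placeholders rather than the tracing code $\Gamma$ constructed in Section 3.3.

```python
import random

# Toy parameters, assumed for illustration only: the order-Q subgroup of Z_P^*.
P, Q, G = 2039, 1019, 4   # P = 2Q + 1 is prime, and G = 4 has order Q


def keygen(codewords, rng):
    """Return the public key (y, h) and one theta per codeword (Equation (1))."""
    width = len(codewords[0])                     # plays the role of 2k
    r = [rng.randrange(1, Q) for _ in range(width)]
    alpha = [rng.randrange(Q) for _ in range(width)]
    h = [pow(G, rj, P) for rj in r]
    y = 1
    for hj, aj in zip(h, alpha):
        y = y * pow(hj, aj, P) % P
    numerator = sum(rj * aj for rj, aj in zip(r, alpha)) % Q
    thetas = []
    for gamma in codewords:
        denominator = sum(rj * gj for rj, gj in zip(r, gamma)) % Q
        # No key exists for this codeword in the unlikely event of a zero denominator.
        thetas.append(numerator * pow(denominator, -1, Q) % Q if denominator else None)
    return (y, h), thetas


def encrypt(public_key, message, rng):
    """Return (M * y^a, h_1^a, ..., h_2k^a) for a fresh random a."""
    y, h = public_key
    a = rng.randrange(1, Q)
    return (message * pow(y, a, P) % P, [pow(hj, a, P) for hj in h])


def decrypt(ciphertext, gamma, theta):
    """Return S / U^theta where U = prod_j H_j^gamma_j."""
    s, big_h = ciphertext
    u = 1
    for hj, gj in zip(big_h, gamma):
        u = u * pow(hj, gj, P) % P                # needs no secret material
    return s * pow(pow(u, theta, P), -1, P) % P   # the only use of the secret theta
```

Note how `decrypt` mirrors the remark in the text: U is computed from public data, and the private key enters only in the final exponentiation.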

¹A codeword might not have an associated private key in the extremely unlikely event that the denominator is zero
in the calculation of $\theta_i$.

3.1 Proof of security


We  now  show  that  our  encryption  scheme  is  semantically  secure  against  a  passive  adversary  assuming 
the difficulty of the Decision Diffie-Hellman problem (DDH) in $G_q$. The assumption says that in $G_q$,
no polynomial time statistical test can distinguish with non-negligible advantage between the two
distributions $D = (g_1, g_2, g_1^a, g_2^a)$ and $R = (g_1, g_2, g_1^a, g_2^b)$ where $g_1, g_2$ are chosen at random in $G_q$ and
a, b are chosen at random in $\mathbb{F}_q$.

Theorem  3.1  The  encryption  scheme  is  semantically  secure  against  a  passive  adversary  assuming 
the  difficulty  of  DDH  in  Gq. 

Proof  Suppose  the  scheme  is  not  semantically  secure  against  a  passive  adversary.  Then  there  exists 
an adversary that given the public key $(y, h_1, \ldots, h_{2k})$ produces two messages $M_0, M_1 \in G_q$. Given the
encryption C of one of these messages the adversary can tell with non-negligible advantage $\epsilon$ which of
the two messages he was given. We show that such an adversary can be used to decide DDH in $G_q$.
Given $(g_1, g_2, u_1, u_2)$ we perform the following steps to determine if it is chosen from R or D:

Step 1: Choose random $r_2, \ldots, r_{2k} \in \mathbb{F}_q$. Set $y = g_1$, $h_1 = g_2$, and $h_i = g_2^{r_i}$ for $i = 2, \ldots, 2k$.

Step 2: Give $(y, h_1, \ldots, h_{2k})$ to the adversary. Adversary returns $M_0, M_1 \in G_q$.

Step 3: Pick a random $b \in \{0, 1\}$ and construct the ciphertext

$$C = \left(M_b u_1,\ u_2,\ u_2^{r_2},\ \ldots,\ u_2^{r_{2k}}\right)$$

Step 4: Give the ciphertext C to the adversary. Adversary returns $b' \in \{0, 1\}$.

Step 5: If $b = b'$ output “D”. Otherwise output “R”.

Observe that if the tuple $(g_1, g_2, u_1, u_2)$ is chosen from D, then the ciphertext C is an encryption of
$M_b$. If the quadruple is from R, then the ciphertext is an encryption of $M_b g_1^{(a_1 - a_2)}$, where $u_1 = g_1^{a_1}$ and
$u_2 = g_2^{a_2}$. In other words, the ciphertext is the encryption of a random message. Hence $b = b'$ holds
with  probability  1/2.  By  a  standard  argument,  a  non-negligible  success  probability  for  the  adversary 
implies  a  non-negligible  success  probability  in  deciding  DDH.  □ 
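
The observation that a tuple from D yields a well formed ciphertext can be checked directly. The sketch below covers only Steps 1 and 3 of the reduction (there is no adversary); the toy group and the names `simulate_challenge` and `honest_encrypt` are assumptions for illustration.

```python
import random

# Toy parameters, assumed for illustration only: the order-Q subgroup of Z_P^*.
P, Q, G = 2039, 1019, 4   # P = 2Q + 1 is prime, and G = 4 has order Q


def simulate_challenge(g1, g2, u1, u2, m_b, k, rng):
    """Steps 1 and 3: build a public key and a challenge ciphertext from the tuple."""
    r = [1] + [rng.randrange(1, Q) for _ in range(2 * k - 1)]  # r_1 = 1, so h_1 = g2
    y = g1
    h = [pow(g2, ri, P) for ri in r]
    ciphertext = (m_b * u1 % P, [pow(u2, ri, P) for ri in r])
    return (y, h), ciphertext


def honest_encrypt(public_key, message, a):
    """Encrypt exactly as in Section 3, with the randomness a made explicit."""
    y, h = public_key
    return (message * pow(y, a, P) % P, [pow(hj, a, P) for hj in h])
```

With $u_1 = g_1^{a}$ and $u_2 = g_2^{a}$ the simulated ciphertext coincides with `honest_encrypt(pub, m_b, a)`; with unrelated exponents $a_1, a_2$ it coincides with an honest encryption of $M_b g_1^{(a_1 - a_2)}$ under randomness $a_2$.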


3.2  Constructing  new  representations 

To decrypt, it suffices to know any representation of y with respect to the base $h_1, \ldots, h_{2k}$. We
have already noted that if $d_1, \ldots, d_m \in \mathbb{Z}_q^{2k}$ are representations of y then any convex combination of
$d_1, \ldots, d_m$ is also a representation of y. The following lemma shows that convex combinations are the
only new representations of y that can be efficiently constructed from $d_1, \ldots, d_m$.

Lemma 3.2 Let $(y, h_1, \ldots, h_{2k})$ be a public key. Suppose an adversary is given the public key and m
private keys $d_1, \ldots, d_m \in \mathbb{Z}_q^{2k}$ for $m < 2k - 1$. If the adversary can generate a new representation d of
y with respect to the base $h_1, \ldots, h_{2k}$ that is not a convex combination of $d_1, \ldots, d_m$ then the adversary
can compute discrete logs in $G_q$.

Proof Let g be a generator of $G_q$. Suppose we are given $z = g^x$. We show how to use the adversary
to compute x. Choose random $b, r_1, \ldots, r_{2k}, s_1, \ldots, s_{2k} \in \mathbb{Z}_q$. Construct the set $\{h_1, \ldots, h_{2k}\}$ where
$h_i = z^{r_i} g^{s_i}$ for all i, $1 \leq i \leq 2k$. Compute $y = g^b$. Find m linearly independent (and otherwise random)
solutions $\alpha_1, \ldots, \alpha_m$ to $\alpha \cdot r = 0 \bmod q$ and $\alpha \cdot s = b \bmod q$. These m vectors are representations
of y with respect to the base $h_1, \ldots, h_{2k}$. Suppose that the adversary can find another representation
$\beta$ that is not a convex combination of $\alpha_1, \ldots, \alpha_m$. Then $\beta \cdot r \neq 0 \bmod q$ with probability at least $1 - 1/q$.
But then $(b - \beta \cdot s)(\beta \cdot r)^{-1}$ is the discrete log of z. □
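
The final step of the proof, recovering x from a representation $\beta$ with $\beta \cdot r \neq 0$, is easy to exercise on a toy instance. In the sketch below the adversary is replaced by a stand-in that uses x itself to manufacture $\beta$, which a real adversary could not do; the toy group and all function names are assumptions for illustration.

```python
import random

# Toy parameters, assumed for illustration only: the order-Q subgroup of Z_P^*.
P, Q, G = 2039, 1019, 4   # P = 2Q + 1 is prime, and G = 4 has order Q


def dot(u, v):
    """Inner product modulo Q."""
    return sum(a * b for a, b in zip(u, v)) % Q


def extract_dlog(b, r, s, beta):
    """Return x with z = G^x, given a representation beta of y = G^b with
    respect to the base h_i = z^{r_i} G^{s_i}; None if beta . r = 0 mod Q."""
    br = dot(beta, r)
    if br == 0:
        return None
    return (b - dot(beta, s)) * pow(br, -1, Q) % Q


def demo(x, seed=0, n=4):
    rng = random.Random(seed)
    z = pow(G, x, P)
    while True:
        b = rng.randrange(Q)
        r = [rng.randrange(Q) for _ in range(n)]
        s = [rng.randrange(Q) for _ in range(n)]
        # Stand-in adversary: uses x to build an arbitrary representation of y.
        t = [(x * ri + si) % Q for ri, si in zip(r, s)]   # log of h_i to base G
        if t[-1] == 0:
            continue
        beta = [rng.randrange(Q) for _ in range(n - 1)]
        beta.append((b - dot(beta, t[:-1])) * pow(t[-1], -1, Q) % Q)
        if dot(beta, r) != 0:
            break
    h = [pow(z, ri, P) * pow(G, si, P) % P for ri, si in zip(r, s)]
    check = 1
    for hi, bi in zip(h, beta):
        check = check * pow(hi, bi, P) % P
    assert check == pow(G, b, P)       # beta really is a representation of y
    return extract_dlog(b, r, s, beta)
```

The formula follows from comparing exponents: $x (\beta \cdot r) + \beta \cdot s = b \bmod q$.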


3.3  Non-Black-Box  Tracing 

We  now  turn  our  attention  to  a  tracing  algorithm  for  our  basic  encryption  scheme.  Throughout  this 
subsection  we  assume  the  pirate  decoder  contains  at  least  one  representation  of  y.  Furthermore,  we 
assume that by examining the decoder implementation it is possible to obtain one of these representations,
d. Hence, the tracing algorithm presented in this section is not black box: it assumes one
can  extract  at  least  one  representation  of  y  from  the  pirate  decoder.  Whenever  it  is  easy  to  reverse 
engineer  the  pirate  decoder  this  algorithm  directly  exposes  the  keys  used  by  the  pirate.  In  the  next 
section  we  show  how  to  convert  this  algorithm  into  a  black-box  tracing  algorithm  that  exposes  the 
pirate  keys  without  reverse  engineering  the  pirate  decoder. 

Suppose the pirate obtains k keys $d_1, \ldots, d_k$. Let d be the representation of y found in the pirate
decoder. Then by Lemma 3.2 we know that d must lie in the linear span of the representations
$d_1, \ldots, d_k$. We construct a tracing algorithm that given d outputs one of $d_1, \ldots, d_k$.

Recall that the construction of private keys made use of a set $\Gamma \subseteq \mathbb{Z}_q^{2k}$ containing $\ell$ codewords.
Each of the $\ell$ users is given a private key $d_i \in \mathbb{Z}_q^{2k}$ which is a multiple of a codeword in $\Gamma$. To solve the
tracing problem we must construct a set $\Gamma \subseteq \mathbb{Z}_q^{2k}$ containing $\ell$ codewords with the following property.

Let d be a point in the linear span of some k codewords $\gamma^{(1)}, \ldots, \gamma^{(k)} \in \Gamma$. Then at least one $\gamma$
in $\gamma^{(1)}, \ldots, \gamma^{(k)}$ must be a member of any coalition (of at most k users) that can create d. This $\gamma$
identifies one of the private keys that must have participated in the construction of the pirated key d.
Furthermore, there should exist an efficient tracing algorithm that when given d as input, outputs $\gamma$.
In fact, our tracing algorithm will output every $\gamma^{(i)}$ that has nonzero weight in the linear combination.


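The set $\Gamma$: We begin by describing the set $\Gamma$ containing $\ell$ codewords over $\mathbb{Z}_q^{2k}$. Since q is a large
prime we may assume $q > \max(\ell, 2k)$. Consider the following $(\ell - 2k) \times \ell$ matrix:

$$A = \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & 2 & 3 & \cdots & \ell \\
1^2 & 2^2 & 3^2 & \cdots & \ell^2 \\
1^3 & 2^3 & 3^3 & \cdots & \ell^3 \\
\vdots & \vdots & \vdots & & \vdots \\
1^{\ell-2k-1} & 2^{\ell-2k-1} & 3^{\ell-2k-1} & \cdots & \ell^{\ell-2k-1}
\end{pmatrix} \pmod{q}$$

Observe that any vector in the span of the rows of A corresponds to a polynomial of degree at most
$\ell - 2k - 1$ evaluated at the points $1, \ldots, \ell$.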

Let $b_1, \ldots, b_{2k}$ be a basis of the linear space of vectors satisfying $Ax = 0 \bmod q$. Viewing these 2k
vectors as the columns of a matrix we obtain an $\ell \times 2k$ matrix B:

$$B = \begin{pmatrix}
| & | & | & & | \\
b_1 & b_2 & b_3 & \cdots & b_{2k} \\
| & | & | & & |
\end{pmatrix}$$

We define $\Gamma$ as the set of rows of the matrix B. Hence, $\Gamma$ contains $\ell$ codewords each of length 2k.
We note that using Lagrange interpolation one can directly construct the i’th codeword in $\Gamma$ using
approximately $\ell$ arithmetic operations modulo q.
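
For small parameters the code $\Gamma$ can be produced by plain linear algebra. The sketch below does not use the Lagrange interpolation shortcut mentioned above; instead it computes a null-space basis of A by Gaussian elimination modulo q, which is slower but easy to follow. The function names are assumptions, and since a null-space basis is not unique, the particular B returned is only one valid choice.

```python
def nullspace_mod(A, q):
    """Return a basis of {x : A x = 0 mod q} for a prime q, via reduced row echelon form."""
    rows, cols = len(A), len(A[0])
    M = [[x % q for x in row] for row in A]
    pivots, rank = [], 0
    for c in range(cols):
        piv = next((i for i in range(rank, rows) if M[i][c]), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], -1, q)
        M[rank] = [x * inv % q for x in M[rank]]
        for i in range(rows):
            if i != rank and M[i][c]:
                f = M[i][c]
                M[i] = [(x - f * y) % q for x, y in zip(M[i], M[rank])]
        pivots.append(c)
        rank += 1
        if rank == rows:
            break
    basis = []
    for free in (c for c in range(cols) if c not in pivots):
        v = [0] * cols
        v[free] = 1                       # one basis vector per free column
        for i, pc in enumerate(pivots):
            v[pc] = -M[i][free] % q
        basis.append(v)
    return basis


def tracing_code(l, k, q):
    """Return (A, Gamma): A is the (l - 2k) x l matrix above, Gamma the rows of B."""
    A = [[pow(j, e, q) for j in range(1, l + 1)] for e in range(l - 2 * k)]
    basis = nullspace_mod(A, q)           # the 2k column vectors b_1, ..., b_2k
    gamma = [[b[i] for b in basis] for i in range(l)]
    return A, gamma
```

For example, `tracing_code(8, 2, 1019)` returns eight codewords of length four, and every row of A is orthogonal modulo 1019 to every column of B.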

Non-Black-Box tracing algorithm: Consider the set of vectors in $\Gamma$. Let $d \in \mathbb{F}_q^{2k}$ be a vector
formed by taking a linear combination of at most k vectors in $\Gamma$. We show that given d one can
efficiently determine the unique set of vectors in $\Gamma$ used to construct d. Since the vectors in $\Gamma$ form
the rows of the matrix B above we know there exists a vector $w \in \mathbb{F}_q^{\ell}$ of Hamming weight at most k
such that $w \cdot B = d$. We show how to recover the vector w given d.

Step 1: Find a vector $v \in \mathbb{F}_q^{\ell}$ such that $v \cdot B = d$. Many such vectors exist. Choose one arbitrarily.²
Since $(v - w) \cdot B = 0$ we know that $v - w$ is in the linear span of the rows of the matrix A (the
rows of A span the space of vectors orthogonal to the columns of B). In other words, there exists
a unique polynomial $f \in \mathbb{F}_q[x]$ of degree at most $\ell - 2k - 1$ such that $v - w = (f(1), \ldots, f(\ell))$.

Step 2: Since w has Hamming weight at most k, we know that $(f(1), \ldots, f(\ell))$ equals v in all but k
components. Hence, using Berlekamp’s algorithm [2] we can find f from v. The polynomial f
gives us the vector $v - w$ from which we recover w as required.

For completeness we briefly recall Berlekamp’s algorithm. The algorithm enables us to find f given
the vector $v \in \mathbb{F}_q^{\ell}$. Let g be a polynomial of degree at most k such that $g(i) = 0$ for all $i = 1, \ldots, \ell$
for which $f(i) \neq v_i$ (where $v_i$ is the i’th component of v). Then we know that for all $i = 1, \ldots, \ell$ we
have $f(i) g(i) = g(i) v_i$. The polynomial $fg$ has degree at most $\ell - k - 1$. Hence, we get $\ell$ equations
(for each of $i = 1, \ldots, \ell$) in $\ell$ variables (the variables are the coefficients of the polynomials $fg$ and g,
where the leading coefficient of g is 1). Let h and g be a solution where g is a non-zero polynomial:
h is a polynomial of degree at most $\ell - k - 1$ and g is of degree at most k. We know that whenever
$f(i) = v_i$ (i.e. at $\ell - k$ or more points) we have $h(i) = g(i) v_i = g(i) f(i)$. Since $h - gf$ has degree at most
$\ell - k - 1$ and vanishes at $\ell - k$ or more points, it is identically zero. It follows that $f = h/g$.
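
The decoding step just described can be written out as a small program. The sketch below sets up the $\ell$ linear equations $h(i) - g(i) v_i = 0$ with g monic of degree k, solves them by naive Gaussian elimination (cubic time, chosen for brevity rather than the fast algorithms cited under "Running time"), divides h by g, and returns $w = v - (f(1), \ldots, f(\ell))$. All function names are assumptions for illustration.

```python
def solve_mod(M, rhs, q):
    """Return one solution of M x = rhs (mod prime q); free variables are set to 0.
    The system is assumed to be consistent."""
    n, m = len(M), len(M[0])
    aug = [[x % q for x in row] + [b % q] for row, b in zip(M, rhs)]
    pivots, rank = [], 0
    for c in range(m):
        piv = next((i for i in range(rank, n) if aug[i][c]), None)
        if piv is None:
            continue
        aug[rank], aug[piv] = aug[piv], aug[rank]
        inv = pow(aug[rank][c], -1, q)
        aug[rank] = [x * inv % q for x in aug[rank]]
        for i in range(n):
            if i != rank and aug[i][c]:
                f = aug[i][c]
                aug[i] = [(x - f * y) % q for x, y in zip(aug[i], aug[rank])]
        pivots.append(c)
        rank += 1
        if rank == n:
            break
    x = [0] * m
    for i, c in enumerate(pivots):
        x[c] = aug[i][m]
    return x


def poly_div(num, den, q):
    """Quotient of num by den mod q; coefficient lists are lowest degree first."""
    num, dn = num[:], len(den) - 1
    inv = pow(den[-1], -1, q)
    quot = [0] * (len(num) - dn)
    for i in range(len(num) - dn - 1, -1, -1):
        c = num[i + dn] * inv % q
        quot[i] = c
        for j, dj in enumerate(den):
            num[i + j] = (num[i + j] - c * dj) % q
    return quot


def recover_w(v, k, q):
    """Given v = (f(1), ..., f(l)) + w with w of weight at most k, return w."""
    l = len(v)
    rows, rhs = [], []
    for i in range(1, l + 1):
        h_part = [pow(i, e, q) for e in range(l - k)]             # coefficients of h
        g_part = [-v[i - 1] * pow(i, e, q) % q for e in range(k)]  # low coefficients of g
        rows.append(h_part + g_part)
        rhs.append(v[i - 1] * pow(i, k, q) % q)                   # monic term of g
    sol = solve_mod(rows, rhs, q)
    h, g = sol[:l - k], sol[l - k:] + [1]
    f = poly_div(h, g, q)
    return [(v[i - 1] - sum(c * pow(i, e, q) for e, c in enumerate(f))) % q
            for i in range(1, l + 1)]
```

The nonzero positions of the returned w identify the codewords, and hence the users, whose keys were combined.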

This  completes  the  description  of  the  tracing  algorithm.  Our  tracing  algorithm  satisfies  several 
properties: 

Error free tracing The tracing algorithm is deterministic in the sense that there is no error probability.
Any key output by the tracing algorithm must have participated in the construction of
the  pirated  key. 

Beyond threshold tracing If more than k parties colluded to create the pirated key, then Berlekamp’s
algorithm may fail to recover the polynomial f, and tracing will fail. Above this bound, recent
results of Guruswami and Sudan [8] may be used to output a list of candidate polynomials for
f. The tracer gets a list of “leads” for the fraud investigation that includes the actual colluders.
This will be effective against coalitions of size at most $2k - 1$.

Running time The tracing algorithm requires that we solve a linear system of dimension $\ell$ (the total
number of users). A naive implementation runs in time $O(\ell^2)$ (field operations). Asymptotically
efficient versions of Berlekamp’s algorithm run in time $\tilde{O}(\ell)$, where the “soft-Oh” notation hides
polylog terms. The fastest known algorithm, due to Pan [13], runs in time $O(\ell \log \ell \log\log \ell)$.

Our  scheme  also  supports  two  types  of  black  box  tracing  techniques,  which  we  describe  in  the  next 
two  sections. 

²If B is in canonical form with the identity matrix as its first 2k rows, then $v = (d \,|\, 0 \ldots 0)$ suffices.

4 Minimal Access Black box tracing against arbitrary pirates


In  this  section,  we  show  a  minimal  access  black  box  tracing  algorithm  for  our  basic  encryption  scheme 
that  works  against  arbitrary  pirates.  To  achieve  this  we  introduce  an  easier  tracing  goal  called  black 
box  confirmation. 

4.1  Black  Box  Confirmation 

When the police find a pirate decoder they often have a suspect pirate in mind. The goal of black
box confirmation is to allow the police to confirm their suspicion. A black box confirmation algorithm
is defined as follows: the algorithm is given (1) the public key and all random bits used during initial
key generation, (2) black box access to a pirate decoder $\mathcal{D}$, and (3) a set $T_{\mathrm{suspect}}$ of private keys
suspected of creating $\mathcal{D}$. The black box confirmation algorithm outputs either “not-guilty” or “key
d is guilty” for some $d \in T_{\mathrm{suspect}}$. Let $T_{\mathcal{D}}$ be the set of private keys in the pirate’s possession when
$\mathcal{D}$ was created. The output must satisfy the following two requirements:

1. Confirmation: If $T_{\mathcal{D}} \subseteq T_{\mathrm{suspect}}$ then the confirmation algorithm must pronounce at least one
key $d \in T_{\mathrm{suspect}}$ as guilty.

2. Soundness: If the confirmation algorithm outputs key d is guilty then $d \in T_{\mathcal{D}}$.

As always we assume that the size of both $T_{\mathrm{suspect}}$ and $T_{\mathcal{D}}$ is less than the collusion bound k. The
police do not know ahead of time whether $T_{\mathcal{D}} \subseteq T_{\mathrm{suspect}}$. Hence, condition (2) ensures that when
$T_{\mathcal{D}} \not\subseteq T_{\mathrm{suspect}}$ the pirate cannot fool the algorithm into accusing an innocent user. Note that when the
police have a small number of suspect sets $T^{(1)}_{\mathrm{suspect}}, T^{(2)}_{\mathrm{suspect}}, \ldots$ they can easily find a guilty key $d \in T_{\mathcal{D}}$
using black box confirmation.

A black box confirmation algorithm implies a black box tracing algorithm as follows. Run the confirmation algorithm on all $\binom{\ell}{k}$ candidate coalitions ($\ell$ is the total number of users in the system). By the confirmation property (Property 1), when the tested coalition contains the guilty coalition, some member of the guilty coalition will be pronounced guilty. A guilty user might also be caught when other coalitions are tested. By the soundness property (Property 2), whenever a key $d$ is pronounced guilty the key is a member of the guilty coalition with very high probability. Thus we obtain a black box tracing algorithm with a running time of $O(\binom{\ell}{k} k^2)$. This is not an efficient tracing algorithm, but it shows that black box tracing is possible in principle. Hence, we do not have to rely on the pirate to embed specific information in the pirate decoder. In the next section we show an efficient black box tracing algorithm against a restricted set of pirates.
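
The exhaustive reduction from tracing to confirmation can be sketched in a few lines of Python. This is only an illustration: the names `trace_by_confirmation`, `make_mock_confirm`, and the oracle interface (returning `None` for “not-guilty” or an accused key) are assumptions made here, and the mock oracle is an idealised stand-in for a real confirmation algorithm.

```python
from itertools import combinations

def trace_by_confirmation(users, k, confirm):
    """Run `confirm` on every k-subset of `users` and collect the accused keys.

    `confirm(coalition)` is a hypothetical oracle: it returns None for
    "not-guilty", or one element of `coalition` for "key d is guilty".
    """
    guilty = set()
    for coalition in combinations(users, k):   # binom(len(users), k) oracle calls
        verdict = confirm(coalition)
        if verdict is not None:
            guilty.add(verdict)
    return guilty

def make_mock_confirm(pirate_keys):
    """Idealised oracle that only ever accuses keys the pirate really holds."""
    pirate = set(pirate_keys)
    def confirm(coalition):
        hits = sorted(pirate & set(coalition))
        return hits[0] if hits else None
    return confirm

# Six users, coalitions of size 2, pirate holds keys 1 and 4.
guilty = trace_by_confirmation(range(6), 2, make_mock_confirm({1, 4}))
print(guilty)
```

The loop makes one oracle call per candidate coalition, which is where the $\binom{\ell}{k}$ factor in the running time comes from.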

We briefly outline the basic idea behind our black box confirmation algorithm. Let $T_{\text{suspect}} = \{d_1, \ldots, d_k\}$ and let $g$ be a generator of $G_q$. Let $(y, h_1, \ldots, h_{2k})$ be a public key. To confirm its suspicion of $T_{\text{suspect}}$ the tracer queries the pirate decoder with a random invalid ciphertext $C = (S, g^{z_1}, \ldots, g^{z_{2k}})$, where the vector $z$ satisfies $z \cdot d_i = w$ for all $d_i \in T_{\text{suspect}}$. Here $w$ is a random element of $\mathbb{F}_q$. This ciphertext is invalid since $(g^{z_1}, \ldots, g^{z_{2k}})$ is most likely not of the form $(h_1^r, \ldots, h_{2k}^r)$ for any $r$. We show that when $T_{\mathcal{D}} \subseteq T_{\text{suspect}}$ the pirate decoder cannot distinguish this invalid ciphertext from a real one. Consequently, we show that it must respond with $A = S/g^{z \cdot d}$ where $d$ is some representation of $y$ in the convex hull of $d_1, \ldots, d_k$. Hence, when $T_{\mathcal{D}} \subseteq T_{\text{suspect}}$ the pirate decoder always responds with $A = S/g^w$. When $T_{\mathcal{D}} \cap T_{\text{suspect}} = \emptyset$ the decoder’s view is independent of $S/g^w$. Hence, the tracer can tell if $T_{\text{suspect}}$ contains one of the pirate’s keys. With a bit more work the tracer can then point to a specific $d \in T_{\text{suspect}}$ that must be in the pirate’s possession.

4.2 Definitions and Distributions


Before we present the black box confirmation algorithm we need to define a few terms. For a given suspect set $T_{\text{suspect}} = \{d_1, \ldots, d_k\} \subseteq \mathbb{F}_q^{2k}$ define the following terms:

• For $i = 0, \ldots, k$ define $T_i = \{d_1, \ldots, d_i\} \subseteq T_{\text{suspect}}$. $T_0$ is the empty set.

• For $i = 0, \ldots, k$, and $W \in G_q$ define $CT_i(W)$ as the set:

$$CT_i(W) = \left\{ C = (S, H_1, \ldots, H_{2k}) \;:\; S \in G_q \text{ and } W = \prod_{j=1}^{2k} H_j^{\delta_j} \text{ for all } d = (\delta_1, \ldots, \delta_{2k}) \in T_i \right\}$$

Define $CT_i = \bigcup_{W \in G_q} CT_i(W)$.

Note that when key $d \in T_i$ is used to decrypt $C \in CT_i(W)$ we get $S/W$.

• For $i = 0, \ldots, k$ define the distribution $CW_i$ on pairs $(W, C)$ as follows: pick a random $W \in G_q$, pick a random $C \in CT_i(W)$, output $(W, C)$.

• For $i = 0, \ldots, k$ define $P_{\mathcal{D},i}$ as:

$$P_{\mathcal{D},i} = \Pr_{(W,C) \in CW_i} \left[ \mathcal{D}(C, S/W) = \text{valid} \right]$$

where $C = (S, H_1, \ldots, H_{2k})$. Here $(W, C)$ is chosen from the distribution $CW_i$ and $\mathcal{D}(C, S/W)$ denotes the success or failure of the pirate decoder on input $C, S/W$ (minimal access).

• Given a public key $(y, h_1, \ldots, h_{2k})$ we refer to the set $\{(S, h_1^r, \ldots, h_{2k}^r) : r \in \mathbb{F}_q\}$ as the set of valid ciphertexts. We refer to ciphertexts outside this set as invalid ciphertexts.

Define the distribution $CW_{\text{valid}}$ on pairs $(W, C)$ as follows: (1) pick a random $r \in \mathbb{F}_q$ and $S \in G_q$, (2) set $W = y^r$ and $C = (S, h_1^r, \ldots, h_{2k}^r)$, (3) output $(W, C)$. Note that $C$ is a valid ciphertext and its decryption is $S/W$.

Note that for any $i$ we can efficiently sample from the distribution $CW_i$ as follows: (1) pick a random $w \in \mathbb{F}_q$ and set $W = g^w$, (2) pick a random vector $z = (z_1, \ldots, z_{2k}) \in \mathbb{F}_q^{2k}$ such that $z \cdot d = w$ for all $d \in T_i$, (3) pick a random $S \in G_q$, (4) set $C = (S, g^{z_1}, \ldots, g^{z_{2k}})$, and output $(W, C)$.
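
The four sampling steps can be tried out on a toy group. The sketch below assumes the order-11 subgroup of $\mathbb{Z}_{23}^*$ generated by 2 (parameters far too small for real use), uses rejection sampling in place of solving the linear system for $z$, and introduces the helper names `sample_cw` and `decrypt` purely for illustration. The keys in `T_i` are arbitrary linearly independent vectors, not keys produced by the scheme's key generation.

```python
import random

P, Q, G = 23, 11, 2          # toy parameters: G has order Q in Z_P^*

def dot(u, v):
    """Inner product in F_q."""
    return sum(x * y for x, y in zip(u, v)) % Q

def sample_cw(T_i, n, rng):
    """Sample (W, C) with C = (S, [g^z_1..g^z_n]) and z.d = w for all d in T_i."""
    w = rng.randrange(Q)
    W = pow(G, w, P)
    while True:                               # rejection sampling: toy sizes only
        z = [rng.randrange(Q) for _ in range(n)]
        if all(dot(z, d) == w for d in T_i):
            break
    S = pow(G, rng.randrange(Q), P)           # random element of the subgroup
    return W, (S, [pow(G, zj, P) for zj in z])

def decrypt(C, d):
    """Decrypt with representation d: S / prod_j H_j^{delta_j}."""
    S, H = C
    denom = 1
    for Hj, dj in zip(H, d):
        denom = denom * pow(Hj, dj, P) % P
    return S * pow(denom, -1, P) % P          # modular inverse needs Python 3.8+

rng = random.Random(1)
T_i = [[1, 2, 3, 4], [2, 0, 1, 5]]            # two illustrative keys, 2k = 4
W, C = sample_cw(T_i, 4, rng)
expected = C[0] * pow(W, -1, P) % P           # S / W
print(all(decrypt(C, d) == expected for d in T_i))
```

Every key in $T_i$ decrypts the sampled ciphertext to the same value $S/W$, which is the property the definitions above rely on.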

We claim that $P_{\mathcal{D},0} \le 1/q$. To see this observe that when $(W, C)$ is chosen from $CW_0$ we have that $W$ is independent of $C$. It follows that $\mathcal{D}(C, S/W)$ is valid with probability at most $1/q$.

4.3  Description  of  Black  Box  Confirmation  Algorithm 

We are now ready to describe the black box confirmation algorithm. An outline of the algorithm is given in [4]. The algorithm is given minimal access to $\mathcal{D}$ and works as follows:

Algorithm 1:

Step 1: Compute estimates for every $P_{\mathcal{D},i}$.

For each $i = 0, \ldots, k$ do:

Step 1a: For $j = 1, \ldots, \lambda$ pick random and independent pairs $(W_j, C_j)$ chosen according to the distribution $CW_i$. The value of $\lambda$ will be determined later.

Step 1b: Run $\mathcal{D}$ on all inputs $C_1, \ldots, C_\lambda$ and let $c_i$ be the number of times that $\mathcal{D}(C_j, S_j/W_j) =$ valid.

Step 1c: Set $p_i = c_i/\lambda$.

Step 2: If $p_k < 3/4$ output not-guilty and stop.

Step 3: Otherwise, since $P_{\mathcal{D},0} \le 1/q$ it follows that $p_0 < 1/4$ with high probability (we calculate exact probabilities below). Then:

$$\frac{1}{2} < |p_k - p_0| = \left| \sum_{i=1}^{k} (p_i - p_{i-1}) \right| \le \sum_{i=1}^{k} |p_i - p_{i-1}|$$

It follows that there exists a $1 \le j \le k$ such that $|p_j - p_{j-1}| > \frac{1}{2k}$. Output “$d_j \in T_{\text{suspect}}$ is guilty”, and stop.


This completes the description of the black box confirmation algorithm. The algorithm requires $\lambda(k + 1)$ queries to the pirate decoder $\mathcal{D}$. As we show in Lemma 4.6, we take $\lambda$ on the order of $k \log k$ and hence the algorithm has running time $O(k^2 \log k)$. Note that Algorithm 1 does not need the public key or the private bits used during key generation, only the $k$ private keys of the suspect coalition.
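
The control flow of Algorithm 1 can be sketched as follows. The interface is an assumption made for illustration: `samplers[i]()` is taken to return one ready-made query $(C, S/W)$ with $(W, C)$ drawn from the $i$-th distribution, `decoder` returns a boolean, and the accused index is chosen as the largest gap between consecutive estimates (the text only requires some $j$ whose gap exceeds $1/2k$).

```python
def confirm(decoder, samplers, suspects, lam):
    """Sketch of Algorithm 1; returns None for "not-guilty" or the accused key."""
    k = len(suspects)
    p = []
    for i in range(k + 1):                       # Step 1: estimate each success rate
        valid = sum(1 for _ in range(lam) if decoder(*samplers[i]()))
        p.append(valid / lam)
    if p[k] < 0.75:                              # Step 2
        return None
    # Step 3: pick an index with a large jump between consecutive estimates
    # (largest gap is an illustrative choice, not mandated by the text).
    j = max(range(1, k + 1), key=lambda t: abs(p[t] - p[t - 1]))
    return suspects[j - 1]

# Mock setting: a query drawn for index i is represented by the integer i, and
# the mock decoder "holds" only d2, so it succeeds exactly when i >= 2.
samplers = [lambda i=i: (i, None) for i in range(4)]
mock_decoder = lambda C, M: C >= 2
print(confirm(mock_decoder, samplers, ["d1", "d2", "d3"], 8))
```

The mock replaces all group arithmetic with an index comparison, so it exercises only the estimate-and-compare logic, not the cryptography.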

In  the  rest  of  this  section,  we  will  prove  the  following  theorem. 

Theorem  4.1  Algorithm  1  is  a  black  box  confirmation  algorithm. 

To prove Theorem 4.1, we must show that Algorithm 1 satisfies the confirmation property (Property 1) and the soundness property (Property 2) for black box confirmation. These will be proven in Lemma 4.2 and Lemma 4.6 respectively. The probability space in both lemmas is over the random bits used by Algorithm 1, the random bits used by the pirate and pirate decoder, and the random set of $k$ private keys given to the pirate.


4.4  Algorithm  1  Satisfies  the  Confirmation  Property 

We show that Algorithm 1 satisfies the confirmation property, i.e., no pirate can fool Algorithm 1 when $T_{\mathcal{D}} \subseteq T_{\text{suspect}}$. To do so, we show that the following non-traceable pirate does not exist.

Non-traceable pirate: Let $\mathcal{P}$ be some pirate. Let $(y, h_1, \ldots, h_{2k})$ be a public key and let $T_{\mathcal{D}} = \{d_1, \ldots, d_k\}$ be a random set of $k$ private keys. We say that the pirate $\mathcal{P}$ is $\epsilon$-non-traceable if, given the public key and $T_{\mathcal{D}}$, the pirate creates a pirate decoder $\mathcal{D}$ with the following properties: (1) $\mathcal{D}$ correctly decrypts all valid ciphertexts, and (2) when $T_{\mathcal{D}} \subseteq T_{\text{suspect}}$ the pirate decoder $\mathcal{D}$ causes Algorithm 1 to output “not-guilty” with probability at least $\epsilon$. The following lemma shows that a non-traceable pirate does not exist. Recall that $\lambda$ is the number of samples used in Step 1 of Algorithm 1.

Lemma 4.2 Let $\epsilon > 0$ and let $\lambda = 1$. Suppose $\mathcal{P}$ is an $\epsilon$-non-traceable pirate. Then $\mathcal{P}$ can be used to distinguish DH tuples from random tuples with advantage $\epsilon$.

Proof Suppose $T_{\mathcal{D}} \subseteq T_{\text{suspect}}$. Then, by assumption, when Algorithm 1 is given a decoder $\mathcal{D}$ created by $\mathcal{P}$, Algorithm 1 outputs not-guilty with probability at least $\epsilon$. This means $p_k < 3/4$ with probability at least $\epsilon$. Since $\lambda = 1$ it follows that $P_{\mathcal{D},k} \le 1 - \epsilon$. This means that the pirate $\mathcal{P}$ is able to build a pirate decoder $\mathcal{D}$ that correctly decrypts all valid ciphertexts; however, for a random $C \in CT_k(W)$ it outputs “invalid” on input $C, S/W$ with probability at least $\epsilon$.

We show how the pirate $\mathcal{P}$ can be used to solve DDH in $G_q$. Given a challenge tuple $(g_1, g_2, u_1, u_2)$ we use the following algorithm to decide whether it is a random tuple or a Diffie-Hellman tuple. We write $g_2 = g_1^c$ for some unknown $c$.

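Algorithm A:

Step 1: Choose a random set of $k$ codewords in $\Gamma$. Denote these by $\gamma^{(1)}, \ldots, \gamma^{(k)} \in \mathbb{F}_q^{2k}$. (These will correspond to the $k$ private keys given to the pirate.)

Step 2: Pick a random $A \in \mathbb{F}_q$ such that $g_1^A g_2 \ne 1$.

Step 3: Pick random vectors $a, b \in \mathbb{F}_q^{2k}$ such that the vector $a - Ab$ is orthogonal to the $k$ vectors $\gamma^{(1)}, \ldots, \gamma^{(k)}$. Write $a = (a_1, \ldots, a_{2k})$ and $b = (b_1, \ldots, b_{2k})$.

Step 4: Set $h_i = g_1^{a_i} g_2^{b_i}$ for $i = 1, \ldots, 2k$.

Step 5: Pick a random vector $\alpha \in \mathbb{F}_q^{2k}$ such that $\alpha$ is orthogonal to $a - Ab$. Write $\alpha = (\alpha_1, \ldots, \alpha_{2k})$.

Step 6: Set $y = h_1^{\alpha_1} \cdots h_{2k}^{\alpha_{2k}}$. Note that $y = g_1^{\alpha \cdot a} g_2^{\alpha \cdot b} = g_1^{A(\alpha \cdot b)} g_2^{\alpha \cdot b} = (g_1^A g_2)^{\alpha \cdot b}$.

Step 7: For $i = 1, \ldots, k$ set $\theta_i = \frac{\alpha \cdot b}{\gamma^{(i)} \cdot b}$. Set $d_i = \theta_i \gamma^{(i)}$.

Observe that $d_1, \ldots, d_k$ are representations of $y$ to the base $h_1, \ldots, h_{2k}$. Indeed, for $d_i = (\delta_1, \ldots, \delta_{2k})$ we have:

$$\prod_{j=1}^{2k} h_j^{\delta_j} = g_1^{a \cdot d_i} g_2^{b \cdot d_i} = \left[ g_1^{a \cdot \gamma^{(i)}} g_2^{b \cdot \gamma^{(i)}} \right]^{\theta_i} = \left[ (g_1^A g_2)^{b \cdot \gamma^{(i)}} \right]^{\theta_i} = (g_1^A g_2)^{\alpha \cdot b} = y$$

Step 8: Run the pirate $\mathcal{P}$ giving it the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$. The pirate builds a pirate decoder $\mathcal{D}$.

Step 9: Pick a random $S \in G_q$ and construct the ciphertext: $C = (S, u_1^{a_1} u_2^{b_1}, \ldots, u_1^{a_{2k}} u_2^{b_{2k}})$. Let $C = (S, H_1, \ldots, H_{2k})$ and set $W = \prod_{i=1}^{2k} H_i^{\alpha_i}$.

Step 10: Query the pirate decoder $\mathcal{D}$ on the ciphertext/message pair $(C, S/W)$. If $\mathcal{D}$ returns “valid” output that $(g_1, g_2, u_1, u_2)$ is a Diffie-Hellman tuple. Otherwise, output that $(g_1, g_2, u_1, u_2)$ is a random tuple.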
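
The key set-up in Steps 2 to 7 can be checked numerically on a toy group. The sketch below is an illustration under several assumptions: it works in the order-11 subgroup of $\mathbb{Z}_{23}^*$, it is handed the exponent $c$ directly (the real reduction only sees $g_2$ and never learns $c$), the codewords are arbitrary vectors rather than codewords of the scheme's code, and rejection sampling stands in for solving the orthogonality constraints. The names `simulate_keys` and `represent` are hypothetical helpers.

```python
import random

P, Q = 23, 11                      # toy group: order-Q subgroup of Z_P^*

def dot(u, v):
    return sum(x * y for x, y in zip(u, v)) % Q

def rand_vec(n, rng):
    return [rng.randrange(Q) for _ in range(n)]

def simulate_keys(g1, c, gammas, rng):
    """Steps 2-7 for a known c; returns (h, y, [d_1..d_k])."""
    n = len(gammas[0])
    g2 = pow(g1, c, P)
    A = rng.choice([x for x in range(Q) if (x + c) % Q != 0])  # g1^A * g2 != 1
    while True:                    # t plays the role of a - A*b (Step 3)
        b, t = rand_vec(n, rng), rand_vec(n, rng)
        if all(dot(t, gm) == 0 and dot(b, gm) != 0 for gm in gammas):
            break
    a = [(ti + A * bi) % Q for ti, bi in zip(t, b)]
    h = [pow(g1, ai, P) * pow(g2, bi, P) % P for ai, bi in zip(a, b)]  # Step 4
    while True:                    # Step 5: alpha orthogonal to a - A*b
        alpha = rand_vec(n, rng)
        if dot(alpha, t) == 0:
            break
    y = represent(h, alpha)        # Step 6
    keys = []
    for gm in gammas:              # Step 7: theta_i = (alpha.b)/(gamma_i.b)
        theta = dot(alpha, b) * pow(dot(gm, b), -1, Q) % Q
        keys.append([theta * gj % Q for gj in gm])
    return h, y, keys

def represent(h, d):
    """Compute prod_j h_j^{d_j} in the toy group."""
    out = 1
    for hj, dj in zip(h, d):
        out = out * pow(hj, dj, P) % P
    return out

rng = random.Random(7)
h, y, keys = simulate_keys(2, 3, [[1, 1, 1, 1], [1, 2, 4, 8]], rng)
print(all(represent(h, d) == y for d in keys))
```

Each simulated key $d_i = \theta_i \gamma^{(i)}$ is a representation of $y$, which is the observation made after Step 7.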

This concludes the description of algorithm A for solving DDH. We show that algorithm A above correctly decides whether the given tuple $(g_1, g_2, u_1, u_2)$ is a DDH tuple. This follows from the following three claims.

Claim 4.3 The distribution on the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$ given to the pirate $\mathcal{P}$ in Step 8 is identical to the distribution generated by the real key generation algorithm.

Proof Consider the key generation algorithm of Section 3. Observe that given $h_1, \ldots, h_{2k}$ and $y$ the values $\theta_1, \ldots, \theta_k$ are determined. Indeed, given the values $y, h_1, \ldots, h_{2k}$ the vector $r_1, \ldots, r_{2k} \in \mathbb{F}_q$ and $\sum_{i=1}^{2k} r_i \alpha_i$ are determined. Then the $\theta_i$ are determined from equation (1). Hence, when using the real key generation algorithm the probability of seeing a specific valid sequence $(y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)$ is:

$$\Pr[(y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)] = 1/q^{2k+1}$$

where the probability is over the random bits used by the key generation algorithm.

We show that the same holds for the public key given to the pirate in Step 8. Let $\Gamma_0$ be the $k \times 2k$ matrix whose rows are the vectors $\gamma^{(1)}, \ldots, \gamma^{(k)} \in \mathbb{F}_q^{2k}$. This matrix has rank $k$. We first show that given the $k$ linear constraints on $a, b$ in Step 3, the vector $(h_1, \ldots, h_{2k})$ is uniformly distributed in $G_q^{2k}$. Since $h_i = g_1^{a_i} g_2^{b_i} = g_1^{a_i + c b_i}$ it suffices to argue that given the $k$ constraints in Step 3 the $2k$ values $a_i + c b_i$ are uniformly and independently distributed in $\mathbb{F}_q$. To do so it suffices to prove that the set of $3k$ relations $(a - Ab) \cdot \gamma^{(i)} = 0$ for $i = 1, \ldots, k$ and $a_j + c b_j = 0$ for $j = 1, \ldots, 2k$ are linearly independent. In other words, we need to show that the rows of the following $3k \times 4k$ matrix are linearly independent:

$$\Omega = \begin{pmatrix} \Gamma_0 & -A\Gamma_0 \\ I_{2k} & c I_{2k} \end{pmatrix}$$

Here $I_{2k}$ denotes the $2k \times 2k$ unit matrix. One can immediately verify that whenever $c \ne -A$ the matrix has rank $3k$ as required. We know that $c \ne -A$ since in Step 2 we ensure that $g_1^A g_2 \ne 1$. This shows that the vector $(h_1, \ldots, h_{2k})$ is uniformly distributed in $G_q^{2k}$.

Next, we show that $y$ is uniformly distributed in $G_q$ and is independent of $(h_1, \ldots, h_{2k})$. This is immediate since $y = (g_1^A g_2)^{\alpha \cdot b}$ and $\alpha \cdot b$ is uniform in $\mathbb{F}_q$. To see that $\alpha \cdot b$ is uniform (over the choice of $\alpha$) recall that in Step 5 $\alpha$ is generated as a random vector in $\mathbb{F}_q^{2k}$ subject to one constraint $\alpha \cdot (a - Ab) = 0$. This constraint on $\alpha$ is linearly independent of the constraint $\alpha \cdot b = 0$. Hence, since $\alpha \cdot b$ is independent of $(h_1, \ldots, h_{2k})$, so is $y$.

Finally, we show that given $y, h_1, \ldots, h_{2k}$ the value of $\theta_1, \ldots, \theta_k$ is determined. Hence, revealing $\theta_1, \ldots, \theta_k$ does not further constrain $a, b, \alpha$. To see this observe that since $\theta_i \gamma^{(i)}$ is a representation of $y$ to the base $h_1, \ldots, h_{2k}$ we have that:

$$\theta_i = \frac{(a + cb) \cdot \alpha}{(a + cb) \cdot \gamma^{(i)}} = \frac{(A + c)(b \cdot \alpha)}{(a + cb) \cdot \gamma^{(i)}}$$

But $(A + c)(b \cdot \alpha)$ is completely determined by $y$, and the vector $a + cb$ is completely determined by $h_1, \ldots, h_{2k}$. Thus, given $y, h_1, \ldots, h_{2k}$ the value of $\theta_1, \ldots, \theta_k$ is fixed.

To summarize, the probability that in Step 8 the pirate is given the input $(y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)$ is $1/q^{2k+1}$ as required. Overall, we have shown that once $(y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)$ is revealed there are $3k$ relations induced on $a, b$ and two relations induced on $\alpha$ (one by $y$, the other by Step 5). As a result, $(a, b)$ is random in a linear space of dimension $k$ and $\alpha$ is random in a linear space of dimension $2k - 2$. This will be used in the next two claims. □


Claim 4.4 Suppose $(g_1, g_2, u_1, u_2)$ is a random Diffie-Hellman tuple. Then given the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$ the pair $(W, C)$ generated in Step 9 is distributed according to $CW_{\text{valid}}$.

Proof When $(g_1, g_2, u_1, u_2)$ is a Diffie-Hellman tuple we have that $u_1 = g_1^x$ and $u_2 = g_2^x$ for some random unknown $x$. It follows that $C$ generated in Step 9 satisfies:

$$C = \left( S, (g_1^{a_1} g_2^{b_1})^x, \ldots, (g_1^{a_{2k}} g_2^{b_{2k}})^x \right) = (S, h_1^x, \ldots, h_{2k}^x)$$

Since $x$ and $S$ are random and independent of $(y, h_1, \ldots, h_{2k})$ this is a random valid ciphertext. Furthermore $W = u_1^{a \cdot \alpha} u_2^{b \cdot \alpha} = (g_1^A g_2)^{x(b \cdot \alpha)} = y^x$ as required. □


Claim 4.5 Suppose $(g_1, g_2, u_1, u_2)$ is a random tuple. Then given the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$ the pair $(W, C)$ generated in Step 9 is distributed according to $CW_k$.

Proof We write $g_2 = g_1^c$ for some unknown $c$. We also let $u_1 = g_1^{x_1}$ and $u_2 = g_2^{x_2}$, and assume that $x_1 \ne x_2$. We need to show that the pair $(W, C)$ generated in Step 9 satisfies: (1) $W$ is uniform in $G_q$ and independent of $I = (y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)$, and (2) $C$ is uniform in $CT_k(W)$ and independent of $(I, W)$.

We start with $W$. Define $w = (A x_1 + c x_2)(\alpha \cdot b)$. Then:

$$W = u_1^{a \cdot \alpha} u_2^{b \cdot \alpha} = g_1^{(A x_1 + c x_2)(b \cdot \alpha)} = g_1^{w}$$

To see that given $I$ the value $W$ is uniform in $G_q$ it suffices to show that $w$ is uniform in $\mathbb{F}_q$. This is immediate since $\alpha \cdot b \ne 0$ (since $y \ne 1$) and $x_1, x_2$ are independent of $I$.

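Next, we show that given $I, W$ the ciphertext $C$ is uniform in $CT_k(W)$. First, observe that $C$ generated in Step 9 satisfies $C \in CT_k(W)$. To see this write $C = (S, H_1, \ldots, H_{2k})$. Then $H_j = u_1^{a_j} u_2^{b_j}$, and thus for any $d_i = \theta_i \gamma^{(i)} = (\delta_1, \ldots, \delta_{2k})$ we have:

$$\prod_{j=1}^{2k} H_j^{\delta_j} = u_1^{a \cdot d_i} u_2^{b \cdot d_i} = \left[ g_1^{x_1 (a \cdot \gamma^{(i)})} g_1^{c x_2 (b \cdot \gamma^{(i)})} \right]^{\theta_i} = g_1^{(A x_1 + c x_2)(b \cdot \gamma^{(i)}) \theta_i} = g_1^{w} = W$$

Here we used $a \cdot \gamma^{(i)} = A (b \cdot \gamma^{(i)})$ from Step 3 and $\theta_i (b \cdot \gamma^{(i)}) = \alpha \cdot b$ from Step 7. Therefore $C \in CT_k(W)$. Next, we show that given $I, W$ the vector $C$ is uniform in $CT_k(W)$ over the choice of $a, b$. To do so we show that if $C = (S, g_1^{z_1}, \ldots, g_1^{z_{2k}})$ then $z = (z_1, \ldots, z_{2k})$ is uniform in a linear space of dimension $k$. Since $CT_k(W)$ is associated to a linear space of dimension $k$ and $C \in CT_k(W)$ it will follow that $C$ spans all of $CT_k(W)$ over the choice of $a, b$ and is thus uniform in it.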

Let $C = (S, g_1^{z_1}, \ldots, g_1^{z_{2k}})$. Then by definition of $C$ we know that $z = x_1 a + c x_2 b$. Therefore, it suffices to show that given $I, W$ the vector $x_1 a + c x_2 b$ spans a space of dimension $k$ over the choice of $a, b$. In Claim 4.3 we showed that $I = (y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)$ induces $3k$ linear relations on the pair $(a, b)$. Let $\Omega$ be the matrix as in Claim 4.3. We need to show that given that $\Omega \cdot [a \| b] = t$, for some fixed $t \in \mathbb{F}_q^{3k}$, the vector $x_1 a + c x_2 b$ spans a space of dimension $k$ over the choice of $a, b$. This amounts to showing that the following $5k \times 4k$ matrix has full rank (i.e. has rank $4k$):

$$\begin{pmatrix} \Gamma_0 & -A\Gamma_0 \\ I_{2k} & c I_{2k} \\ x_1 I_{2k} & c x_2 I_{2k} \end{pmatrix}$$

This is immediate since when $x_1 \ne x_2$ the two bottom parts of the matrix already have rank $4k$. This shows that given $I, W$ the ciphertext $C$ comes from a linear space of dimension $k$. Hence, $C$ is uniform in $CT_k(W)$ over the choice of $a, b$. This concludes the proof that given $I$ the pair $(W, C)$ is distributed according to $CW_k$. □


Proof of Lemma 4.2: Claim 4.3 shows that in Step 8 of Algorithm A the pirate $\mathcal{P}$ is given input from a distribution that is identical to the distribution in the real attack. Consequently, it generates a pirate decoder $\mathcal{D}$ with the following properties. Let $C = (S, H_1, \ldots, H_{2k})$ for some $H_1, \ldots, H_{2k}$, let $W = \prod_{i=1}^{2k} H_i^{\alpha_i}$, and let $M = S/W$. When $C$ is a valid ciphertext then, by definition, $M$ is the decryption of $C$. Given the pair $(C, M)$ the decoder $\mathcal{D}$ satisfies: (1) if $(W, C)$ is distributed according to $CW_{\text{valid}}$ then the decoder returns “valid”, and (2) if $(W, C)$ is distributed according to $CW_k$ it returns “invalid” with probability at least $\epsilon$. By Claims 4.4 and 4.5 the pair $(W, C)$ constructed in Step 9 falls into cases (1) or (2) depending on whether the given tuple $(g_1, g_2, u_1, u_2)$ is a Diffie-Hellman tuple. Hence, when the algorithm outputs its decision in Step 10 we have that:

$$\left| \Pr\left[ \mathcal{A}(g_1, g_2, g_1^{x}, g_2^{x}) = \text{“yes”} \right] - \Pr\left[ \mathcal{A}(g_1, g_2, g_1^{x_1}, g_2^{x_2}) = \text{“yes”} \right] \right| \ge \epsilon$$

This concludes the proof of Lemma 4.2. □

4.5 Algorithm 1 Satisfies the Soundness Property

The next lemma shows that Algorithm 1 satisfies the soundness property (Property 2). Recall that $\lambda$ is the number of samples we make in Step 1 of Algorithm 1.

Lemma 4.6 Let $\mathcal{D}$ be a pirate decoder built using a set $T_{\mathcal{D}}$ of at most $k$ private keys. Let $T_{\text{suspect}}$ be a suspect set of size at most $k$. Let $\epsilon > 0$ and set $\lambda = 64k \log(\ldots)$. Then when Algorithm 1 interacts with $\mathcal{D}$ and outputs “key $d$ is guilty” then $d \in T_{\mathcal{D}}$ with probability at least $1 - \epsilon$.

Proof Since $\lambda = 64k \log(\ldots)$ we know, by the Chernoff bound, that for all $i = 0, \ldots, k$ we have $|p_i - P_{\mathcal{D},i}| < 1/8k$ with probability at least $1 - \epsilon$. Therefore, if key $d_j \in T_{\text{suspect}}$ is pronounced guilty then with probability at least $1 - \epsilon$ we have

$$|P_{\mathcal{D},j} - P_{\mathcal{D},j-1}| > \frac{1}{4k}$$

This means that the pirate decoder is able to distinguish between the distributions $CW_j$ and $CW_{j-1}$ with advantage at least $1/4k$. The following lemma shows that if $d_j \notin T_{\mathcal{D}}$ then this is not possible unless DDH in $G_q$ can be solved with advantage $1/4k^2$. This will immediately prove Lemma 4.6.
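
The gap between consecutive success probabilities used in this proof can be derived in one line from the triangle inequality. As a worked check, assuming every estimate satisfies $|p_i - P_{\mathcal{D},i}| < 1/8k$ and that the accused index $j$ was selected in Step 3 of Algorithm 1 with $|p_j - p_{j-1}| > 1/2k$:

```latex
\begin{align*}
|P_{\mathcal{D},j} - P_{\mathcal{D},j-1}|
  &\ge |p_j - p_{j-1}| - |p_j - P_{\mathcal{D},j}| - |p_{j-1} - P_{\mathcal{D},j-1}| \\
  &>   \frac{1}{2k} - \frac{1}{8k} - \frac{1}{8k} \;=\; \frac{1}{4k}.
\end{align*}
```

Applying Lemma 4.7 with its distinguishing gap set to $1/4k$ then gives a DDH advantage of at least $(1/4k)/k = 1/4k^2$, which matches the advantage quoted in the proof.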

Lemma 4.7 Let $(y, h_1, \ldots, h_{2k})$ be a public key. Let $T_{\text{suspect}} = \{d^s_1, \ldots, d^s_k\}$ be the set of $k$ private keys used to define the sets $CT_i$ for $i = 0, \ldots, k$. Suppose $\mathcal{P}$ is a pirate that given the public key and a random set of $k$ private keys $T_{\mathcal{D}}$ is able to construct a decoder $\mathcal{D}$ with the following property:

$$\exists j_0 \in \{1, \ldots, k\} : \quad d^s_{j_0} \notin T_{\mathcal{D}} \quad \text{and} \quad |P_{\mathcal{D},j_0} - P_{\mathcal{D},j_0-1}| > \epsilon$$

Then the pirate $\mathcal{P}$ can be used to solve DDH in $G_q$ with advantage at least $\epsilon/k$.

Proof Since $|P_{\mathcal{D},j_0} - P_{\mathcal{D},j_0-1}| > \epsilon$ we know that the pirate is able to build a decoder $\mathcal{D}$ that can distinguish $CW_{j_0}$ from $CW_{j_0-1}$ with advantage $\epsilon$, without knowing $d^s_{j_0}$. We show how to use such a pirate to solve DDH in $G_q$ with advantage $\epsilon/k$. Given a challenge tuple $(g_1, g_2, u_1, u_2)$ we use the following algorithm to decide whether it is a random tuple or a Diffie-Hellman tuple. We write $g_2 = g_1^c$ for some unknown $c$.

Algorithm B:

Step 0: Pick a random $j_0 \in \{1, \ldots, k\}$.

Step 1: Choose a random set of $k$ codewords in $\Gamma$. Denote these by $\gamma^{(1)}, \ldots, \gamma^{(k)} \in \mathbb{F}_q^{2k}$. (These will correspond to the $k$ private keys given to the pirate.) Also, let $\gamma_s^{(1)}, \ldots, \gamma_s^{(k)}$ be an arbitrary set of $k$ codewords in $\Gamma$. These will correspond to keys that belong to the suspect coalition. We assume $\gamma_s^{(j_0)} \notin \{\gamma^{(1)}, \ldots, \gamma^{(k)}\}$.

Step 2: Pick a random $A \in \mathbb{F}_q$ such that $g_1^A g_2 \ne 1$.

Step 3: Pick random vectors $a, b \in \mathbb{F}_q^{2k}$ such that the vector $a - Ab$ is orthogonal to all vectors $(\gamma^{(1)}, \ldots, \gamma^{(k)}, \gamma_s^{(1)}, \ldots, \gamma_s^{(j_0-1)})$. If the chosen $a, b$ satisfy that $a - Ab$ is orthogonal to $\gamma_s^{(j_0)}$ then repeat Step 3. Write $a = (a_1, \ldots, a_{2k})$ and $b = (b_1, \ldots, b_{2k})$.

Step 4: Set $h_i = g_1^{a_i} g_2^{b_i}$ for $i = 1, \ldots, 2k$.

Pick a random vector $v \in \mathbb{F}_q^{2k}$ such that $v$ is orthogonal to all $\gamma_s^{(1)}, \ldots, \gamma_s^{(j_0)}$. Let $v = (v_1, \ldots, v_{2k})$.

Step 5: Pick a random vector $\alpha \in \mathbb{F}_q^{2k}$ such that $\alpha$ is orthogonal to $a - Ab$ and is orthogonal to $v$. Write $\alpha = (\alpha_1, \ldots, \alpha_{2k})$.

Step 6: Set $y = h_1^{\alpha_1} \cdots h_{2k}^{\alpha_{2k}}$. Then $y = g_1^{\alpha \cdot a} g_2^{\alpha \cdot b} = g_1^{A(\alpha \cdot b)} g_2^{\alpha \cdot b} = (g_1^A g_2)^{\alpha \cdot b}$.

Step 7: For $i = 1, \ldots, k$ set $\theta_i = \frac{\alpha \cdot b}{\gamma^{(i)} \cdot b}$. Set $d_i = \theta_i \gamma^{(i)}$ and set $T_{\mathcal{D}} = \{d_1, \ldots, d_k\}$.

For $i = 1, \ldots, j_0 - 1$ set $\theta^s_i = \frac{\alpha \cdot b}{\gamma_s^{(i)} \cdot b}$. Set $d^s_i = \theta^s_i \gamma_s^{(i)}$.

As in Algorithm A we know that $d_1, \ldots, d_k$ and $d^s_1, \ldots, d^s_{j_0-1}$ are representations of $y$ to the base $h_1, \ldots, h_{2k}$.

Step 8: Run the pirate $\mathcal{P}$ giving it the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$. The pirate builds a pirate decoder $\mathcal{D}$.

Step 9: Pick a random $S \in G_q$ and construct the ciphertext: $C = (S, g_1^{v_1} u_1^{a_1} u_2^{b_1}, \ldots, g_1^{v_{2k}} u_1^{a_{2k}} u_2^{b_{2k}})$. Let $C = (S, H_1, \ldots, H_{2k})$ and set $W = \prod_{i=1}^{2k} H_i^{\alpha_i}$.

Step 10: Query the pirate decoder $\mathcal{D}$ on the ciphertext/message pair $(C, S/W)$. If $\mathcal{D}$ returns “valid” output that $(g_1, g_2, u_1, u_2)$ is a Diffie-Hellman tuple. Otherwise, output that $(g_1, g_2, u_1, u_2)$ is a random tuple.

This concludes the description of algorithm B for solving DDH. We show that algorithm B above correctly decides whether the given tuple $(g_1, g_2, u_1, u_2)$ is a DDH tuple. This follows from the following three claims.

Claim 4.8 The distribution on the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$ given to the pirate $\mathcal{P}$ in Step 8 is identical to the distribution generated by the real key generation algorithm.

Proof The proof is identical to the proof of Claim 4.3 with minor changes. The only difference is that the pair $(a, b)$ is subject to $k + j_0 - 1$ constraints in Step 3. Hence, given the public key, $(a, b)$ sits in a space of dimension $k - j_0 + 1$. □

Let $(y, h_1, \ldots, h_{2k})$ be the public key generated in Step 6. For $i = 1, \ldots, k$ define $\theta^s_i \in \mathbb{F}_q$ as the value that makes $\theta^s_i \gamma_s^{(i)}$ a representation of $y$ to the base $h_1, \ldots, h_{2k}$. For $i = 1, \ldots, j_0 - 1$ the value $\theta^s_i$ is computed in Step 7, but for $i = j_0, \ldots, k$ the value $\theta^s_i$ is unknown to Algorithm B. For $i = 1, \ldots, k$ define $d^s_i = \theta^s_i \gamma_s^{(i)}$. As above, $d^s_i$ is unknown to Algorithm B for $i = j_0, \ldots, k$.

Claim 4.9 Let $T_{\text{suspect}} = \{d^s_1, \ldots, d^s_k\}$ be the suspect set used to define the sets $CT_0, \ldots, CT_k$. Suppose $(g_1, g_2, u_1, u_2)$ is a random Diffie-Hellman tuple. As in Step 1 we assume $d^s_{j_0} \notin T_{\mathcal{D}}$. Then given the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$ the pair $(W, C)$ generated in Step 9 is distributed according to $CW_{j_0}$.

Proof We write $g_2 = g_1^c$ for some unknown $c$. We also let $u_1 = g_1^x$ and $u_2 = g_2^x$ for some unknown $x$. We need to show that the pair $(W, C)$ generated in Step 9 satisfies: (1) $W$ is uniform in $G_q$ and independent of $I = (y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)$, and (2) $C$ is uniform in $CT_{j_0}(W)$ and independent of $(I, W)$.

We start with $W$. Define $w = x(A + c)(b \cdot \alpha)$. Then, since $\alpha$ is orthogonal to $v$ we have:

$$W = \prod_{i=1}^{2k} H_i^{\alpha_i} = g_1^{v \cdot \alpha} u_1^{a \cdot \alpha} u_2^{b \cdot \alpha} = g_1^{x(A + c)(b \cdot \alpha)} = g_1^{w}$$

To see that given $I$ the value $W$ is uniform in $G_q$ it suffices to show that $w$ is uniform in $\mathbb{F}_q$. This is immediate since $\alpha \cdot b \ne 0$ (since $y \ne 1$), and $A + c \ne 0$ (since $g_1^A g_2 \ne 1$), and $x$ is independent of $I$.

Next, we show that given $I, W$ the ciphertext $C$ is uniform in $CT_{j_0}(W)$. First, observe that $C$ generated in Step 9 satisfies $C \in CT_{j_0}(W)$. To see this write $C = (S, H_1, \ldots, H_{2k})$. Then $H_j = g_1^{v_j} u_1^{a_j} u_2^{b_j}$. Recall that $v$ is chosen so that it is orthogonal to any $d^s_i$ for $i = 1, \ldots, j_0$. Therefore, for any $i = 1, \ldots, j_0$ and $d^s_i = (\delta_1, \ldots, \delta_{2k})$ we have:

$$\prod_{j=1}^{2k} H_j^{\delta_j} = g_1^{v \cdot d^s_i} u_1^{a \cdot d^s_i} u_2^{b \cdot d^s_i} = \left[ g_1^{a \cdot d^s_i} g_2^{b \cdot d^s_i} \right]^{x} = \left[ \prod_{j=1}^{2k} h_j^{\delta_j} \right]^{x} = y^{x} = W$$

Therefore $C \in CT_{j_0}(W)$. Next, we show that given $I, W$ the vector $C$ is uniform in $CT_{j_0}(W)$ over the choice of $v$.

We first show that $v$ is independent of $I$. Clearly $v$ is independent of $h_1, \ldots, h_{2k}$. To show that it is independent of $y$ we show that $v$ is independent of $b \cdot \alpha$. That is,

$$\Pr[v = v_0 \mid \alpha \cdot v_0 = 0,\ \alpha \cdot b = t,\ \alpha \cdot (a - Ab) = 0] = \Pr[v = v_0]$$

for any $v_0$ that is orthogonal to all $\gamma_s^{(1)}, \ldots, \gamma_s^{(j_0)}$ and any $t \in \mathbb{F}_q$. This follows from the fact that $v_0$ is linearly independent of $b$ since $b \cdot \gamma_s^{(j_0)} \ne 0$, and $v_0$ is linearly independent of $a - Ab$ since $a - Ab$ is not orthogonal to $\gamma_s^{(j_0)}$.

Since $v$ is independent of $I, W$, and $v$ is uniform in a space of dimension $2k - j_0$, we know that $C$ is associated to a linear space of dimension $2k - j_0$ and is independent of $I, W$. But $2k - j_0$ is also the dimension of the linear space associated to $CT_{j_0}(W)$. Hence, given $I, W$ the ciphertext $C$ is uniform in $CT_{j_0}(W)$. □


Claim 4.10 Let $T_{\text{suspect}} = \{d^s_1, \ldots, d^s_k\}$ be the suspect set used to define the sets $CT_0, \ldots, CT_k$. Suppose $(g_1, g_2, u_1, u_2)$ is a random tuple. As in Step 1 we assume $d^s_{j_0} \notin T_{\mathcal{D}}$. Then given the public key $(y, h_1, \ldots, h_{2k})$ and the $k$ private keys $\theta_1, \ldots, \theta_k$ the pair $(W, C)$ generated in Step 9 is distributed according to $CW_{j_0-1}$.

Proof We write $g_2 = g_1^c$ for some unknown $c$. We also let $u_1 = g_1^{x_1}$ and $u_2 = g_2^{x_2}$ for some unknown $x_1, x_2$. We assume $x_1 \ne x_2$. We need to show that the pair $(W, C)$ generated in Step 9 satisfies: (1) $W$ is uniform in $G_q$ and independent of $I = (y, h_1, \ldots, h_{2k}, \theta_1, \ldots, \theta_k)$, and (2) $C$ is uniform in $CT_{j_0-1}(W)$ and independent of $(I, W)$.

We start with $W$. Define $w = (x_1 A + c x_2)(\alpha \cdot b)$. Then since $v$ is orthogonal to $\alpha$:

$$W = \prod_{i=1}^{2k} H_i^{\alpha_i} = g_1^{v \cdot \alpha} u_1^{a \cdot \alpha} u_2^{b \cdot \alpha} = \left[ g_1^{A x_1} g_2^{x_2} \right]^{\alpha \cdot b} = g_1^{w}$$

To see that given $I$ the value $W$ is uniform in $G_q$ it suffices to show that $w$ is uniform in $\mathbb{F}_q$. This is immediate since $\alpha \cdot b \ne 0$ (since $y \ne 1$), and $A x_1 + c x_2$ is independent of $I$.

Next, we show that given $I, W$ the ciphertext $C$ is uniform in $CT_{j_0-1}(W)$. First, observe that $C$ generated in Step 9 satisfies $C \in CT_{j_0-1}(W)$. To see this write $C = (S, H_1, \ldots, H_{2k})$. Then $H_j = g_1^{v_j} u_1^{a_j} u_2^{b_j}$. Recall that $v$ is chosen so that it is orthogonal to any $d^s_i$ for $i = 1, \ldots, j_0$. Therefore, for any $i = 1, \ldots, j_0 - 1$ and $d^s_i = (\delta_1, \ldots, \delta_{2k})$ we have:

$$\prod_{j=1}^{2k} H_j^{\delta_j} = g_1^{v \cdot d^s_i} u_1^{a \cdot d^s_i} u_2^{b \cdot d^s_i} = \left[ g_1^{x_1 (a \cdot \gamma_s^{(i)})} g_1^{c x_2 (b \cdot \gamma_s^{(i)})} \right]^{\theta^s_i} = g_1^{(A x_1 + c x_2)(b \cdot \gamma_s^{(i)}) \theta^s_i} = g_1^{w} = W$$

Therefore $C \in CT_{j_0-1}(W)$. Next, we show that given $I, W$ the vector $C$ is uniform in $CT_{j_0-1}(W)$. When $C = (S, g_1^{z_1}, \ldots, g_1^{z_{2k}})$ we know that $z = v + x_1 a + c x_2 b$. Hence, it suffices to show that given $I, W$ the vector $z$ is uniform in a space of dimension $2k - j_0 + 1$ over the choice of $v$ and $x_1, x_2$. As in Claim 4.9 we know that $v$ is independent of $I, W$, and therefore it is uniform in a space of dimension $2k - j_0$. Hence, it suffices to show that given $I, W$ the vector $x_1 a + c x_2 b$ is uniform in a space of dimension 1 over the choice of $x_1, x_2$ and that it is linearly independent of the space of dimension $2k - j_0$ spanned by $v$. First, since $x_1 \ne x_2$ we know that $x_1 a + c x_2 b$ is independent of $h_1, \ldots, h_{2k}$. However, the value $w$ induces one affine constraint on $x_1, x_2$ since $x_1 A + c x_2 = w/(\alpha \cdot b)$. Hence, given $I, W$ the vector $x_1 a + c x_2 b$ spans an affine space of dimension 1 as required. Note that when $x_1 A + c x_2 = 0$ the space $x_1 a + c x_2 b$ is a linear space of dimension 1. We show that this space is not contained in the space spanned by $v$ by showing that when $A x_1 + c x_2 = 0$ the vector $x_1 a + c x_2 b$ cannot be orthogonal to $\gamma_s^{(j_0)}$:

$$(x_1 a + c x_2 b) \cdot \gamma_s^{(j_0)} = x_1 (a - Ab) \cdot \gamma_s^{(j_0)} + (c x_2 + A x_1)(b \cdot \gamma_s^{(j_0)}) = x_1 (a - Ab) \cdot \gamma_s^{(j_0)}$$

But since in Step 3 we make sure that $a - Ab$ is not orthogonal to $\gamma_s^{(j_0)}$ we obtain the desired result. Hence, given $I, W$ the ciphertext $C$ is uniform in $CT_{j_0-1}(W)$. This shows that given $I$ the pair $(W, C)$ is chosen according to the distribution $CW_{j_0-1}$. □

Proof of Lemma 4.7: By Claim 4.8 the pirate is given input sampled from a distribution that is identical to the distribution in the real attack. Hence, it constructs a pirate decoder $\mathcal{D}$ satisfying the conditions of Lemma 4.7 for some $j_0 \in \{1, \ldots, k\}$. Then Claims 4.9 and 4.10 show that when $j_0$ is guessed correctly in Step 0 the pair $(W, C)$ generated in Step 9 is distributed according to $CW_{j_0}$ or $CW_{j_0-1}$ depending on whether $(g_1, g_2, u_1, u_2)$ is a Diffie-Hellman tuple or a random tuple. Hence, when $j_0$ is guessed correctly, Algorithm B solves DDH in $G_q$ with advantage $\epsilon$. The probability that $j_0$ is guessed correctly is $1/k$; hence, Algorithm B solves DDH in $G_q$ with advantage $\epsilon/k$. □


5  Full  Access  Black  box  tracing  against  single-key  pirates 

Suppose a pirate obtains the private keys $d_1, \ldots, d_k$. A natural strategy for the pirate to build a pirate decryption box is to form a new random representation $d$ and then create a box that decrypts using this representation. We call this a “single-key pirate” since only a single representation of $y$ is embedded in the pirate decoder. We show an efficient full access black box tracing algorithm that works against a single-key pirate. Note that when a single user attempts to construct a decoder that cannot be traced back to her she is essentially acting as a single-key pirate.

Formally,  we  model  the  single-key  pirate  as  two  distinct  parties  (Vi,V2).  The  first  party  Vi,  called 
the  key-builder,  is  given  the  public  key,  and  k  random  private  keys  di, ... ,dk •  It  creates  a  new  key  d 
by  forming  a  random  convex  combination  of  the  given  private  keys.  The  key-builder  then  hands  d  to 
the  second  party  V2,  called  the  box- builder.  The  box-builder,  seeing  only  d  and  the  public  key,  is  free 
to  implement  the  pirate  decryption  box  however  it  wants. 

We show how the tracer can extract the representation $d = (\delta_1,\ldots,\delta_{2k})$ from a pirate decoder
created by a single-key pirate. Once $d$ is obtained we use the algorithm of Section 3.3 to recover the
pirate keys. The basic idea in extracting $d$ is to observe the decoder’s behavior on invalid ciphertexts,
e.g., $C = (S, h_1^{z_1},\ldots,h_{2k}^{z_{2k}})$, where the non-constant vector $z$ is chosen by the tracer. This ciphertext
is invalid since the $h_i$'s are raised to different powers. The next lemma shows that the pirate decoder
cannot distinguish invalid ciphertexts from valid ciphertexts (assuming the difficulty of DDH in $G_q$).
Hence, we show that on input $C = (S, H_1,\ldots,H_{2k})$ it must respond with $A$ where

$$A = S \Big/ \prod_{i=1}^{2k} H_i^{\delta_i} = S \Big/ \prod_{i=1}^{2k} h_i^{z_i \delta_i}$$

Non-traceable pirates: Let $(P_1, P_2)$ be a single-key pirate. Let $(y, h_1,\ldots,h_{2k})$ be a public key and
let $d = (\delta_1,\ldots,\delta_{2k})$ be the private key generated by the key-builder $P_1$ given the public key and a
random set of $k$ private keys. By assumption, $d$ is a random convex combination of the $k$ private keys.
We say that the single-key pirate $(P_1, P_2)$ is non-traceable if given $d$ the box-builder $P_2$ is able to
create a pirate decoder $V$ with the following properties: (1) $V$ correctly decrypts all valid ciphertexts,
and (2) when $V$ is given a random invalid ciphertext $C = (S, H_1,\ldots,H_{2k}) \in G_q^{2k+1}$ it outputs a value
different from $S/\prod_{i=1}^{2k} H_i^{\delta_i}$, with non-negligible probability (over the choice of $C$). The following lemma
shows that a non-traceable single-key pirate does not exist.

Lemma 5.1 Suppose $(P_1, P_2)$ is a non-traceable single-key pirate. Then $(P_1, P_2)$ can be used to solve
DDH in $G_q$.

Proof Sketch Given a challenge tuple $(g_1,g_2,u_1,u_2)$ we decide whether it is a random tuple or a
Diffie-Hellman tuple as follows:

Step 1: Pick a random subset of $k$ vectors $\gamma^{(1)},\ldots,\gamma^{(k)} \in \Gamma$ and create a random linear combination
$\gamma \in \mathbb{F}_q^{2k}$ of these $k$ vectors. The vector $\gamma$ will eventually become the private key created by the
key-builder $P_1$.

Step 2: Pick random vectors $a, b \in \mathbb{F}_q^{2k}$ such that $(a - b) \cdot \gamma = 0$.

Step 3: Set $h_i = g_1^{a_i} g_2^{b_i}$ for $i = 1,\ldots,2k$.

Step 4: Pick a random vector $\alpha \in \mathbb{F}_q^{2k}$ such that $\alpha \cdot (a - b) = 0$.

Step  5:  Set  y  =  f] 

Step  6:  Set  6  =  ?j|. 

Set  d-Oj  -  (5i,...,S2k).  Observe  that  d  is  a  representation  of  y  base  hi,...,h2k. 

Step 7: Run the box-builder $P_2$ giving it the public key $(y, h_1,\ldots,h_{2k})$ and the representation $d$.
The box-builder builds a pirate decoder $V$.

Step 8: Using a random $S \in G_q$, construct the ciphertext $C = (S, u_1^{a_1} u_2^{b_1},\ldots,u_1^{a_{2k}} u_2^{b_{2k}})$.

Let $C = (S, H_1,\ldots,H_{2k})$.

Step 9: Query the pirate decoder $V$ on the ciphertext $C$. If $V$ returns $S/\prod_{i=1}^{2k} H_i^{\delta_i}$ output that
$(g_1,g_2,u_1,u_2)$ is a Diffie-Hellman tuple. Otherwise, output that $(g_1,g_2,u_1,u_2)$ is a random tuple.

Observe that the input $I = (y, h_1,\ldots,h_{2k}, d)$ given to the box-builder $P_2$ is sampled from a
distribution identical to that of the real attack. To see this note that $y, h_1,\ldots,h_{2k}$ are uniformly distributed in
$G_q$. Furthermore, $d$ is a random convex combination of a random set of $k$ private keys.

Next, note that when the challenge $(g_1,g_2,u_1,u_2)$ is a Diffie-Hellman tuple then given $I$, the
ciphertext $C$ is a random valid ciphertext. Otherwise, given $I$ the ciphertext $C$ is uniformly distributed
in $G_q^{2k+1}$. Since the decoder behaves differently for valid and invalid ciphertexts the output in Step 9
enables us to solve the given DDH challenge. □

By querying at invalid ciphertexts the tracer learns the value $\prod_{i=1}^{2k} h_i^{z_i \delta_i} = S/A$ for vectors $z$ of its
choice. After $2k$ queries with random linearly independent $z$ the tracer can solve for $h_1^{\delta_1},\ldots,h_{2k}^{\delta_{2k}}$.
Since the tracer knows the discrete log of the $h_i$'s base $g$ (recall Step 2 of key generation) it can
compute $g^{\delta_1},\ldots,g^{\delta_{2k}}$. Ideally, we would like to use homomorphic properties of the discrete log to
run the tracing algorithm of the previous section “in the exponents”. Unfortunately, it is an open
problem to run Berlekamp’s algorithm this way. Instead, we can recover the vector $d = (\delta_1,\ldots,\delta_{2k})$
from $(g^{\delta_1},\ldots,g^{\delta_{2k}})$ by using recent results on trapdoors of the discrete log [12, 14] modulo $p^2 q$ and
modulo $N^2$. For instance, the trapdoor designed by Paillier [14] shows that if encryption is done in the
group $\mathbb{Z}_{N^2}^*$ then the secret factorization of $N$ (known to the tracer only) enables the tracer to recover
$(\delta_1,\ldots,\delta_{2k}) \bmod N$ from $(g^{\delta_1},\ldots,g^{\delta_{2k}})$. The tracing algorithm of the previous section can now be
used to recover the keys in the pirate’s possession. This completes the description of the black box
tracing algorithm for single-key pirates.
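
The query-and-solve step described above can be sketched in a few lines. This is only an illustration under loud assumptions: the group is a toy one (the order-11 subgroup of the integers modulo 23), the names `decoder`, `invert_matrix` and `extract_g_delta` are hypothetical, the decoder is modelled as always answering $S/\prod_i H_i^{\delta_i}$, and the final trapdoor step that turns $g^{\delta_i}$ into $\delta_i$ is not shown.

```python
# Toy parameters (illustrative only): G_q is the order-11 subgroup of Z_23^*.
p, q, g = 23, 11, 2

r = [3, 5, 7, 2]                      # tracer's secret logs: h_i = g^{r_i}
h = [pow(g, ri, p) for ri in r]
delta = [4, 9, 1, 6]                  # representation hidden inside the decoder


def decoder(S, H):
    """Model of a single-key pirate decoder: returns A = S / prod_i H_i^{delta_i}."""
    denom = 1
    for Hi, di in zip(H, delta):
        denom = denom * pow(Hi, di, p) % p
    return S * pow(denom, -1, p) % p


def invert_matrix(Z):
    """Invert a square matrix over F_q by Gauss-Jordan elimination."""
    n = len(Z)
    A = [[x % q for x in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(Z)]
    for col in range(n):
        piv = next(rw for rw in range(col, n) if A[rw][col])
        A[col], A[piv] = A[piv], A[col]
        inv = pow(A[col][col], -1, q)
        A[col] = [x * inv % q for x in A[col]]
        for rw in range(n):
            if rw != col and A[rw][col]:
                f = A[rw][col]
                A[rw] = [(x - f * y) % q for x, y in zip(A[rw], A[col])]
    return [row[n:] for row in A]


def extract_g_delta(box, Z):
    """Recover (g^{delta_1}, ..., g^{delta_2k}) from 2k invalid-ciphertext queries."""
    S = pow(g, 5, p)                                  # any element of G_q
    answers = []
    for z in Z:
        H = [pow(hi, zi, p) for hi, zi in zip(h, z)]  # h_i raised to different powers
        A = box(S, H)
        answers.append(S * pow(A, -1, p) % p)         # S/A = prod_i h_i^{z_i delta_i}
    Zinv = invert_matrix(Z)
    result = []
    for j in range(len(Z)):
        w = 1                                         # w becomes h_j^{delta_j}
        for t, val in enumerate(answers):
            w = w * pow(val, Zinv[j][t], p) % p
        result.append(pow(w, pow(r[j], -1, q), p))    # strip r_j to get g^{delta_j}
    return result


# Four non-constant query vectors; this matrix is invertible modulo 11.
Z = [[1, 2, 3, 4], [2, 3, 5, 7], [1, 0, 4, 9], [3, 1, 1, 2]]
recovered = extract_g_delta(decoder, Z)
```

The linear algebra happens entirely in the exponents: inverting the query matrix modulo $q$ and recombining the answers isolates each $h_j^{\delta_j}$, and knowledge of $r_j$ converts it to $g^{\delta_j}$.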


6  Chosen  ciphertext  security 

In a typical scenario where our system is used it is desirable to defend against chosen ciphertext
attacks. Fortunately, our scheme can be easily modified to be secure against adaptive attacks. The
modification is similar to the approach used by Cramer and Shoup [6]. As in Section 3 we work in a
group $G_q$ of prime order $q$. For example, $G_q$ could be a subgroup of order $q$ of $\mathbb{F}_p^*$ for some prime $p$
where $q \mid p - 1$.


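Key generation: Let $g$ be a generator of $G_q$. Pick random $r_1,\ldots,r_{2k} \in \mathbb{F}_q$ and set $h_i = g^{r_i}$ for
$i = 1,\ldots,2k$. Next, we pick random $x_1, x_2, y_1, y_2 \in \mathbb{F}_q$ and $\alpha_1,\ldots,\alpha_{2k} \in \mathbb{F}_q$ and compute

$$y = h_1^{\alpha_1} h_2^{\alpha_2} \cdots h_{2k}^{\alpha_{2k}} ; \qquad c = h_1^{x_1} h_2^{x_2} ; \qquad d = h_1^{y_1} h_2^{y_2}$$

The public key is $(y, c, d, h_1,\ldots,h_{2k})$. The private key is as in Section 3, but also includes $(x_1, x_2, y_1, y_2)$.
Hence, user $i$’s private key is $(\theta_i, x_1, x_2, y_1, y_2)$.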

Encryption: To encrypt a message $M \in G_q$ do the following: pick a random element $a \in \mathbb{F}_q$, and
compute

$$S = M \cdot y^a ; \qquad H_1 = h_1^a, \ \ldots, \ H_{2k} = h_{2k}^a$$
$$\nu = \mathcal{H}(S, H_1,\ldots,H_{2k}) ; \qquad v = c^a d^{a\nu}$$

where $\mathcal{H}$ is a collision resistant hash function (or chosen from a family of universal one-way hash
functions). Set the ciphertext $C$ to be

$$C = (S, H_1,\ldots,H_{2k}, v)$$

It is a bit surprising that the system can be made secure against chosen ciphertext attacks by appending
a single element $v$ to the ciphertext.


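Decryption: To decrypt a ciphertext $C = (S, H_1,\ldots,H_{2k}, v)$ using a private key $(\theta_i, x_1, x_2, y_1, y_2)$
first compute $\nu = \mathcal{H}(S, H_1,\ldots,H_{2k})$ and check that

$$H_1^{x_1 + y_1 \nu} \cdot H_2^{x_2 + y_2 \nu} = v$$

If the test fails, reject the ciphertext. Otherwise, output

$$M = S / U^{\theta_i} \quad \text{where} \quad U = \prod_{j=1}^{2k} H_j^{\gamma_j}$$

and $\gamma^{(i)} = (\gamma_1,\ldots,\gamma_{2k}) \in \Gamma$ is the codeword from which $\theta_i$ is derived.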
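
The key generation, encryption and validity test above can be exercised on a toy instance. The sketch below rests on several assumptions that are not part of the scheme: the group is the order-11 subgroup of the integers modulo 23, SHA-256 reduced modulo $q$ stands in for the hash $\mathcal{H}$ (with so small a $q$ it is of course not collision resistant), and decryption divides by $\prod_i H_i^{\alpha_i}$ using the exponents $\alpha_i$ directly rather than a user key $\theta_i$ and its codeword. All function names are hypothetical.

```python
import hashlib

# Toy parameters (illustrative only): G_q is the order-11 subgroup of Z_23^*.
p, q, g = 23, 11, 2

# Key generation with fixed "random" choices so the example is reproducible.
r = [3, 5, 7, 2]
h = [pow(g, ri, p) for ri in r]
alpha = [4, 9, 1, 6]
x1, x2, y1, y2 = 2, 7, 5, 3

y = 1
for hi, ai in zip(h, alpha):
    y = y * pow(hi, ai, p) % p
c = pow(h[0], x1, p) * pow(h[1], x2, p) % p
d = pow(h[0], y1, p) * pow(h[1], y2, p) % p


def hash_to_fq(S, H):
    """Stand-in for the hash: SHA-256 of the tuple, reduced mod q (toy only)."""
    data = repr((S, tuple(H))).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q


def encrypt(M, a):
    """Return (S, H, v) with S = M*y^a, H_i = h_i^a and v = c^a * d^(a*nu)."""
    S = M * pow(y, a, p) % p
    H = [pow(hi, a, p) for hi in h]
    nu = hash_to_fq(S, H)
    v = pow(c, a, p) * pow(d, a * nu, p) % p
    return S, H, v


def is_valid(S, H, v):
    """The decryptor's test: H_1^(x1 + y1*nu) * H_2^(x2 + y2*nu) == v."""
    nu = hash_to_fq(S, H)
    return pow(H[0], x1 + y1 * nu, p) * pow(H[1], x2 + y2 * nu, p) % p == v


def decrypt(S, H, v):
    """Reject if the test fails; otherwise remove the blinding factor y^a."""
    if not is_valid(S, H, v):
        return None
    blind = 1
    for Hi, ai in zip(H, alpha):
        blind = blind * pow(Hi, ai, p) % p
    return S * pow(blind, -1, p) % p


message = pow(g, 6, p)
S, H, v = encrypt(message, 7)
```

Decrypting `(S, H, v)` returns `message`, while changing the tag `v` makes the test fail and the ciphertext is rejected.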
Tracing: The tracing algorithm remains unchanged.

We show that the scheme is secure against adaptive chosen ciphertext attack. In other words, we
show that the scheme is secure in the following environment: an adversary is given the public key.
It generates two messages $M_0, M_1$ and is given the encryption $C = E(M_b)$ for $b \in \{0, 1\}$ chosen at
random. The adversary’s goal is to predict $b$. To do so he is allowed to interact with a decryption
oracle that will decrypt any valid ciphertext other than $C$. If the adversary’s guess for $b$ is $b'$ and the
probability that $b = b'$ is $\frac{1}{2} + \epsilon$ then we say that the adversary has advantage $\epsilon$. The system is said to
be secure against an adaptive chosen ciphertext attack if the adversary’s advantage in predicting $b$ is
negligible (as a function of the security parameter).

Theorem 6.1 The above cryptosystem is secure against an adaptive chosen ciphertext attack assuming
that (1) the Decision Diffie-Hellman problem is hard in the group $G_q$, and (2) the hash function
$\mathcal{H}$ is collision resistant (or chosen from a family of universal one-way hash functions).

We assume the hash function $\mathcal{H}$ is collision resistant. Suppose there exists a polynomial time
adversary A that is able to obtain a non-negligible advantage in predicting $b$ when the above cryptosystem
is used. We show that A can be used to solve the Decision Diffie-Hellman problem in $G_q$.

Given a tuple $(g_1,g_2,u_1,u_2)$ in $G_q$ we perform the following steps to determine if it is a random
tuple (i.e., chosen from R) or a Diffie-Hellman tuple (i.e., chosen from D):

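Init Set $h_1 = g_1$ and $h_2 = g_2$. Pick random $r_3,\ldots,r_{2k} \in \mathbb{F}_q$ and set $h_i = g_2^{r_i}$ for $i = 3,\ldots,2k$. Next,
choose random $x_1, x_2, y_1, y_2 \in \mathbb{F}_q$ and $\alpha_1,\ldots,\alpha_{2k} \in \mathbb{F}_q$ and compute

$$y = h_1^{\alpha_1} h_2^{\alpha_2} \cdots h_{2k}^{\alpha_{2k}} ; \qquad c = h_1^{x_1} h_2^{x_2} ; \qquad d = h_1^{y_1} h_2^{y_2}$$

Challenge The adversary A is given the public key and outputs two messages $M_0, M_1 \in G_q$. We
pick a random $b \in \{0, 1\}$ and compute:

$$S = M_b \cdot u_1^{\alpha_1} u_2^{\alpha_2} \prod_{i=3}^{2k} u_2^{\alpha_i r_i}$$
$$H_1 = u_1 , \quad H_2 = u_2 , \quad H_3 = u_2^{r_3} , \quad \ldots , \quad H_{2k} = u_2^{r_{2k}}$$
$$\nu = \mathcal{H}(S, H_1,\ldots,H_{2k}) ; \qquad v = u_1^{x_1 + y_1\nu} u_2^{x_2 + y_2\nu}$$

The challenge ciphertext given to A is $C = (S, H_1,\ldots,H_{2k}, v)$.

Interaction When the adversary A asks to decrypt a ciphertext

$$C' = (S', H'_1,\ldots,H'_{2k}, v')$$

we respond as in a normal decryption: first we check validity of the ciphertext and reject invalid
ciphertexts. For a valid ciphertext we give A the plaintext $M' = S'/\prod_{i=1}^{2k} (H'_i)^{\alpha_i}$.

Output Eventually the adversary A outputs a $b' \in \{0, 1\}$. If $b = b'$ we say the input tuple is from D,
otherwise we say R.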

This completes the description of the algorithm for deciding DDH using A. To complete the proof
of Theorem 6.1 it remains to show two things:

• When $(g_1,g_2,u_1,u_2)$ is chosen from D the joint distribution of the adversary’s view and the bit
$b$ is statistically indistinguishable from the actual attack.

• When $(g_1,g_2,u_1,u_2)$ is chosen from R the hidden bit $b$ is (essentially) independent of the
adversary’s view.

The proofs of both statements are similar to the proofs given by Cramer and Shoup [6] and are given in
the appendix. Based on the two statements a standard argument shows that if the adversary A has
advantage $\epsilon$ in predicting $b$ then the above algorithm for deciding DDH also has advantage $\epsilon$. This
completes the proof of Theorem 6.1.

Key extraction and black box tracing. It is surprising that although the scheme is resistant
to chosen ciphertext attack, the decryption box will decrypt invalid ciphertexts. In particular, it will
decrypt an invalid ciphertext $C = (S, H_1,\ldots,H_{2k}, v)$ where

$$S = M \cdot y^a ; \qquad H_1 = h_1^a , \quad H_2 = h_2^a , \quad H_3 = h_3^{b_3} , \quad \ldots , \quad H_{2k} = h_{2k}^{b_{2k}}$$
$$\nu = \mathcal{H}(S, H_1,\ldots,H_{2k}) ; \qquad v = c^a d^{a\nu}$$

This is an invalid ciphertext since the $h_i$'s are raised to different powers. It passes the decryptor’s
test since $h_1$ and $h_2$ are raised to the same power. It cannot be distinguished from a valid ciphertext,
assuming the hardness of DDH in $G_q$. The ideas of Section 4 can then be applied for black box tracing
and black box confirmation in this setting. For example, suppose a “single-key pirate” constructs
a pirate decoder containing the representation $d = (d_1,\ldots,d_{2k})$. By feeding the decoder invalid
ciphertexts as above the tracer can recover $(g^{d_3},\ldots,g^{d_{2k}})$. Using trapdoors of the discrete log the
tracer can find $(d_3,\ldots,d_{2k})$ from which he can expose all active members of a coalition of size at
most $k - 1$. As in Section 4 black box tracing is possible against any pirate by using “black box
confirmation”.


7  An  open  problem 

The black box confirmation method of Section 4 gives rise to a very general black box tracing strategy.
The tracing strategy works even when the pirate box has $\ell - 1$ out of the total of $\ell$ keys. Let $d_1,\ldots,d_\ell$
be the set of all $\ell$ private keys generated by the key generation algorithm. Let $T_i$ be the set of keys
$\{d_1,\ldots,d_i\}$. There are $\ell + 1$ such sets with $T_0$ being the empty set and $T_\ell$ being the set of all private keys.

Consider  an  encryption  scheme  with  the  following  properties: 

• Given a message $M$ and an $i \in \{0,\ldots,\ell\}$ there is a set of ciphertexts $CT_i$ such that $C \in CT_i$
decrypts correctly to $M$ using all private keys in $T_i$, but decrypts to some $M' \neq M$ when using
keys not in $T_i$ (there could be a different $M'$ for each $d \notin T_i$). Note that all of $CT_\ell$ are valid
ciphertexts since they decrypt correctly to $M$ under all private keys.

• An adversary given all $\ell$ private keys except for key $d_j$ cannot tell whether a ciphertext $C$
decrypts to $M$ using key $d_j$.

Any encryption scheme satisfying both properties above gives rise to an efficient black box tracing
algorithm. Let $CT_i(M)$ be the set of ciphertexts that decrypt to $M$ using any key in $T_i$. Let $CT_i$ be
the union of $CT_i(M)$ over all $M$. Consider a pirate box that plays music. Define

$$p_i = \Pr_{x \in CT_i} [\text{pirate box plays music when given } x]$$

We know that $p_\ell = 1$ since $CT_\ell$ is the set of valid ciphertexts. Similarly, we know $p_0 = 0$ since no key
can decrypt messages in $CT_0$. Hence, since $|p_\ell - p_0| = 1$, there must exist a $j$ such that $|p_j - p_{j+1}| \ge 1/\ell$.
The only difference between $CT_j$ and $CT_{j+1}$ is that key $d_{j+1}$ can decrypt ciphertexts in $CT_{j+1}$ but
cannot decrypt ciphertexts in $CT_j$. Hence, the pirate box is able to distinguish whether key $d_{j+1}$ correctly
decrypts the given ciphertext. By property (2) this is possible only if the pirate had $d_{j+1}$ when building
the pirate box. Hence, $d_{j+1}$ can be pronounced guilty.
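
The existence of such an index is a pigeonhole argument. A short worked version, writing $\ell$ for the total number of keys so that the two extreme probabilities are $p_0 = 0$ and $p_\ell = 1$:

$$1 = |p_\ell - p_0| = \Big| \sum_{j=0}^{\ell-1} (p_{j+1} - p_j) \Big| \le \sum_{j=0}^{\ell-1} |p_{j+1} - p_j| \le \ell \cdot \max_{0 \le j < \ell} |p_{j+1} - p_j|$$

so at least one consecutive gap $|p_{j+1} - p_j|$ is at least $1/\ell$.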

To  summarize,  the  tracing  algorithm  works  as  follows: 

Step 1: For $i = 0,\ldots,\ell$ estimate the value of $p_i$. This is done by picking many random elements
$C \in CT_i$ and testing whether the pirate box “plays music” when given $C$. Using $O(\ell)$ samples
will produce an approximation $\hat{p}_i$ to $p_i$ with error less than $1/4\ell$ with high probability.

Step 2: Find a $j$ such that $|\hat{p}_j - \hat{p}_{j+1}| > 1/8\ell$. Output “$d_{j+1}$ is guilty” and stop.

The tracing algorithm makes a total of $O(\ell^2 \log \ell)$ probes into the pirate box. In an encryption
scheme satisfying both properties above each message $M$ can be encrypted in at least $\ell + 1$ different
ways $(C_0,\ldots,C_\ell)$. Hence, the ciphertext length must be at least $\log_2 \ell$. Unfortunately, the current best
constructions require a ciphertext whose length is linear in the number of users $\ell$.
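
The two tracing steps can be sketched against an abstract model of the ciphertext sets. Everything concrete here is an assumption made for illustration: a ciphertext drawn from $CT_i$ is modelled as the pair `(i, nonce)`, the pirate box is modelled as deterministically using the smallest key index it holds, and the names `sample_ct`, `pirate_box`, `estimate` and `trace` are hypothetical. With the indexing $T_i = \{d_1,\ldots,d_i\}$, a jump between $CT_j$ and $CT_{j+1}$ points at key index $j+1$.

```python
import random

ELL = 6            # total number of keys (toy value)
HELD = {4, 6}      # key indices embedded in the modelled pirate box


def sample_ct(i):
    """Model of a random ciphertext from CT_i: decryptable by keys d_1..d_i only."""
    return (i, random.getrandbits(32))


def pirate_box(ct):
    """Plays music iff the key it uses (its smallest one) decrypts the ciphertext."""
    i, _nonce = ct
    return min(HELD) <= i


def estimate(box, sampler, i, trials):
    """Step 1: empirical estimate of p_i from `trials` random samples of CT_i."""
    hits = sum(box(sampler(i)) for _ in range(trials))
    return hits / trials


def trace(box, sampler, ell, trials=200):
    """Step 2: find a jump larger than 1/(8*ell) between consecutive estimates."""
    p_hat = [estimate(box, sampler, i, trials) for i in range(ell + 1)]
    for j in range(ell):
        if abs(p_hat[j] - p_hat[j + 1]) > 1 / (8 * ell):
            return j + 1          # index of the key that separates CT_j from CT_{j+1}
    return None


guilty = trace(pirate_box, sample_ct, ELL)
```

In this model the estimates are exactly 0 for $i < 4$ and exactly 1 for $i \ge 4$, so the only jump is between $CT_3$ and $CT_4$ and the accused index is 4, one of the keys the box actually holds. A real pirate box may behave probabilistically, in which case the number of samples per estimate matters.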

Open problem: Construct a secure encryption scheme satisfying both properties above where the
ciphertext length is sub-linear in $\ell$. Ideally the ciphertext length should be $O(\log \ell)$.


8  Conclusion 

We  present  an  efficient  public  key  solution  to  the  traitor  tracing  problem.  Our  construction  is  based 
on  Reed-Solomon  codes  and  the  representation  problem  for  discrete  logs.  Traceability  follows  from 
the  hardness  of  discrete  log.  The  semantic  security  of  the  encryption  scheme  against  a  passive  attack 
follows  from  the  Decision  Diffie-Hellman  assumption.  A  simple  extension  achieves  security  against 
an  adaptive  chosen  ciphertext  attack  under  the  same  hardness  assumption.  The  private  key  in  all 
cases  is  just  a  single  element  of  a  finite  field  and  can  be  as  short  as  160  bits.  The  cryptosystem 
can  be  made  to  work  in  any  group  in  which  the  Decision  Diffie-Hellman  problem  is  hard.  It  is  an 
interesting open question to improve on the “black box” traceability of our approach. Also, it seems
reasonable  to  believe  that  there  exists  an  efficient  public  key  traitor  tracing  scheme  that  is  completely 
collusion  resistant.  In  such  a  scheme,  any  number  of  private  keys  cannot  be  combined  to  form  a  new 
key.  Similarly,  the  complexity  of  encryption  and  decryption  is  independent  of  the  size  of  the  coalition 
under  the  pirate’s  control.  An  efficient  construction  for  such  a  scheme  will  provide  a  useful  solution 
to  the  public  key  traitor  tracing  problem. 

To conclude, we mention an application of our system to defending against software piracy.
Typically, when new software is installed from a CD-ROM the user is asked to enter a short unique key
printed on the CD cover. This key identifies the installed copy. Clearly the key printed on the CD
cover has to be short (say under 20 characters) since it is typed in manually. Our system can be used
in this setting as follows: since our private key can be made 120 bits long (to achieve $2^{60}$ security) it
can be printed on the CD cover (each character encodes 6 bits). The software on the CD is encrypted
using our system’s public key. When the user types in his unique CD key the software is decrypted
and installed on the user’s machine. However, if a software pirate attempts to create illegal copies
of the distribution CD (say using a CD-ROM burner) he must also attach a short printed key to the
disk. Using our system, the key he attaches to the bootlegged copies can be traced back to him.
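
The figures quoted for the printed key can be checked with a line of arithmetic. The second relation assumes that the best attack on the group is a generic square-root discrete-log algorithm, which is an assumption added here and is not stated in the text:

$$\frac{120 \text{ bits}}{6 \text{ bits per character}} = 20 \text{ characters}, \qquad \sqrt{2^{120}} = 2^{60} \text{ group operations}$$

so a 120-bit key sits right at the 20-character limit mentioned above.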

A  Proof of Theorem 6.1

To  complete  the  proof  of  Theorem  6.1  we  need  to  prove  the  following  lemma  regarding  the  algorithm 
for  deciding  DDH  in  Gq. 

Lemma A.1

a. When $(g_1,g_2,u_1,u_2)$ is chosen from D the joint distribution of the adversary’s view and the bit
$b$ is statistically indistinguishable from the actual attack.

b. When $(g_1,g_2,u_1,u_2)$ is chosen from R the hidden bit $b$ is (essentially) independent of the
adversary’s view.

Proof The proof is very similar to the one given by Cramer and Shoup [6]. Recall that the challenge
ciphertext given to the adversary is $C = (S, H_1,\ldots,H_{2k}, v)$ where

$$S = M_b \cdot u_1^{\alpha_1} u_2^{\alpha_2} \prod_{i=3}^{2k} u_2^{\alpha_i r_i} ; \qquad H_1 = u_1 , \quad H_2 = u_2 , \quad H_3 = u_2^{r_3} , \quad \ldots , \quad H_{2k} = u_2^{r_{2k}}$$
$$\nu = \mathcal{H}(S, H_1,\ldots,H_{2k}) ; \qquad v = u_1^{x_1 + y_1\nu} u_2^{x_2 + y_2\nu}$$

We start by proving part (a). Say $(g_1,g_2,u_1,u_2)$ is chosen from D. Then there exists an $a$ such
that $u_1 = g_1^a$ and $u_2 = g_2^a$. In this case, the challenge ciphertext $C$ given to the adversary is a valid
encryption of $M_b$. Indeed, one can easily verify that the challenge ciphertext $C$ satisfies:

$$S = M_b \cdot y^a$$
$$H_i = h_i^a \quad \text{for } i = 1,\ldots,2k$$
$$v = c^a d^{\nu a}$$

Furthermore,  during  the  interaction  phase  with  the  adversary  A  the  decryption  oracle  given  to  A 
behaves  exactly  as  it  does  in  the  actual  attack. 

The proof of part (b) is a bit harder. Say $(g_1,g_2,u_1,u_2)$ is chosen from R. Let $u_1 = g_1^{a_1}$ and
$u_2 = g_2^{a_2}$. With high probability $a_1 \neq a_2$. Also, let $g_1 = g_2^w$ (recall $h_1 = g_1$ and $h_2 = g_2$).

We say that a ciphertext $C = (S, H_1,\ldots,H_{2k}, v)$ is valid if $\log_{g_1} H_1 = \log_{g_2} H_2$. The proof of part
(b) follows from the following two claims:

Claim  A.2  If  the  decryption  oracle  given  to  the  adversary  A  rejects  all  invalid  ciphertexts  then  the 
distribution  of  the  bit  b  is  independent  of  the  adversary's  view. 

Proof The value of $S$ in the challenge ciphertext is $S = M_b \cdot u_1^{\alpha_1} u_2^{\alpha_2} \prod_{i=3}^{2k} u_2^{\alpha_i r_i}$. We show that the
adversary has no information about the blinding factor $B$ where

$$B = u_1^{\alpha_1} u_2^{\alpha_2} \prod_{i=3}^{2k} u_2^{\alpha_i r_i}$$

Consequently, the adversary obtains no information about $M_b$; the blinding factor $B$ is a perfect one
time pad.

Consider the point $Q = (\alpha_1,\ldots,\alpha_{2k})$. Given the public key the adversary knows that $Q$ is a
random point satisfying:

$$\log_{g_2} y = w\alpha_1 + \alpha_2 + \sum_{i=3}^{2k} r_i \alpha_i \pmod{q} \qquad (2)$$

When the decryption oracle decrypts a ciphertext $C' = (S', H'_1,\ldots,H'_{2k}, v')$ it gives the adversary the
value $M' = S'/(H'_1)^{\alpha_1} \cdots (H'_{2k})^{\alpha_{2k}}$. Hence, the adversary learns that the point $Q$ satisfies the following
relation:

$$\log_{g_2} \frac{S'}{M'} = \alpha_1 \log_{g_2} H'_1 + \alpha_2 \log_{g_2} H'_2 + \cdots + \alpha_{2k} \log_{g_2} H'_{2k} \qquad (3)$$

Since the decryption oracle decrypts only valid ciphertexts we know that $\log_{g_1} H'_1 = \log_{g_2} H'_2$, or
equivalently $\log_{g_2} H'_1 = w \log_{g_2} H'_2$. If we write $s = \log_{g_2} H'_2$ then equation (3) becomes:

$$\log_{g_2} \frac{S'}{M'} = ws\,\alpha_1 + s\,\alpha_2 + \alpha_3 \log_{g_2} H'_3 + \cdots + \alpha_{2k} \log_{g_2} H'_{2k} \qquad (4)$$

Observe that equation (2) and all of the equations (4) obtained by the adversary are restricted to a
subspace of dimension $2k - 1$ (the coefficient of $\alpha_1$ is always $w$ times the coefficient of $\alpha_2$).

To complete the proof of the claim note that the blinding factor $B$ in the challenge ciphertext
satisfies:

$$\log_{g_2} B = (w a_1)\alpha_1 + a_2 \alpha_2 + \sum_{i=3}^{2k} a_2 r_i \alpha_i$$

Assuming $a_1 \neq a_2$ this relation on $Q$ is linearly independent of equation (2) and all the equations (4).
Hence, conditioned on the adversary’s view, the blinding factor $B$ is uniformly distributed in $G_q$. It
follows that the adversary has no information about $M_b$ implying that its output $b'$ is independent of $b$. □


Claim A.3 The decryption oracle we supply to the adversary A will reject all invalid ciphertexts,
except with negligible probability.

Proof The proof is based on the fact that the adversary does not know the point $P = (x_1, x_2, y_1, y_2)$.
From the public key and the challenge ciphertext the adversary knows that the point $P$ satisfies the
following relations:

$$\log_{g_2} c = x_1 w + x_2 \qquad (5)$$
$$\log_{g_2} d = y_1 w + y_2 \qquad (6)$$
$$\log_{g_2} v = (x_1 + y_1\nu) w a_1 + (x_2 + y_2\nu) a_2 \qquad (7)$$

Suppose the adversary queries the decryption oracle at an invalid ciphertext $C' = (S', H'_1,\ldots,H'_{2k}, v')$
where $H'_1 = g_1^{b_1}$ and $H'_i = g_2^{b_i}$ for $i = 2,\ldots,2k$. Let $\nu' = \mathcal{H}(S', H'_1,\ldots,H'_{2k})$. There are three cases to
consider:


Case 1: $(S, H_1,\ldots,H_{2k}) = (S', H'_1,\ldots,H'_{2k})$. Since $C \neq C'$ we have $v \neq v'$. Since $v'$ is invalid the
decryption oracle will reject $C'$.

Case 2: $(S, H_1,\ldots,H_{2k}) \neq (S', H'_1,\ldots,H'_{2k})$ and $\nu = \nu'$. This implies the adversary found a collision
in the hash function $\mathcal{H}$, contradicting the assumption on $\mathcal{H}$.

Case 3: $(S, H_1,\ldots,H_{2k}) \neq (S', H'_1,\ldots,H'_{2k})$ and $\nu \neq \nu'$. The decryption oracle will reject the
ciphertext, unless

$$(H'_1)^{x_1 + y_1\nu'} \cdot (H'_2)^{x_2 + y_2\nu'} = v'$$

In other words,

$$\log_{g_2} v' = w b_1 (x_1 + y_1\nu') + b_2 (x_2 + y_2\nu') \qquad (8)$$


However, Equations (5), (6), (7) and (8) are linearly independent. This can be seen from the determinant

$$\det \begin{pmatrix} w & 1 & 0 & 0 \\ 0 & 0 & w & 1 \\ w a_1 & a_2 & w a_1 \nu & a_2 \nu \\ w b_1 & b_2 & w b_1 \nu' & b_2 \nu' \end{pmatrix} = w^2 (\nu - \nu')(b_1 - b_2)(a_1 - a_2)$$

Since the ciphertext is invalid we know that $b_1 \neq b_2$ and hence the determinant is non zero (recall that
$a_1 \neq a_2$ and that $\nu \neq \nu'$ in Case 3). Equations
(5), (6) and (7) tell the adversary that $P$ lies on a certain line $\mathcal{L}$. Hence, $P$ is one of $q$ possible
points. By linear independence, the hyperplane defined by (8) intersects the line $\mathcal{L}$ at a point. For the
adversary’s first query the probability (over the choice of $P$) that the intersection occurs at the point
$P$ is $1/q$. Hence, the invalid ciphertext will be accepted by the decryption oracle with probability at
most $1/q$. After the first query is rejected the adversary knows that the point $P$ is now one of only
$q - 1$ different values along $\mathcal{L}$. Hence, for the second query, the probability that $\mathcal{L}$ intersects equation
(8) at $P$ is at most $1/(q-1)$. For the $i$’th query the probability is at most $1/(q-i+1)$. Overall, the probability
that an invalid ciphertext is ever accepted is at most $\sum_{i=1}^{n} \frac{1}{q-i+1} \le \frac{n}{q-n}$ where $n$ is the total number
of queries. Since the adversary runs in polynomial time $q$ is exponential in $n$. Hence, the probability
that an invalid ciphertext is ever accepted is negligible. □
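
The closed form of the determinant used in Case 3 can be sanity-checked numerically. The sketch below is an illustration rather than part of the proof: it evaluates the $4 \times 4$ determinant over a small prime field (the prime 101 and the helper names `det_mod` and `check_identity` are arbitrary choices) and compares it with $w^2 (\nu - \nu')(b_1 - b_2)(a_1 - a_2)$ for random inputs.

```python
import random


def det_mod(M, q):
    """Determinant of a square matrix over F_q (q prime) by Gaussian elimination."""
    M = [[x % q for x in row] for row in M]
    n, det = len(M), 1
    for col in range(n):
        piv = next((rw for rw in range(col, n) if M[rw][col]), None)
        if piv is None:
            return 0                      # no pivot: the matrix is singular
        if piv != col:
            M[col], M[piv] = M[piv], M[col]
            det = -det                    # a row swap flips the sign
        det = det * M[col][col] % q
        inv = pow(M[col][col], -1, q)
        for rw in range(col + 1, n):
            f = M[rw][col] * inv % q
            M[rw] = [(x - f * y) % q for x, y in zip(M[rw], M[col])]
    return det % q


def check_identity(q=101, trials=200, seed=1):
    """Compare the determinant with w^2 (nu - nu')(b1 - b2)(a1 - a2) on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        w, a1, a2, b1, b2, nu, nu2 = (rng.randrange(q) for _ in range(7))
        M = [
            [w, 1, 0, 0],                          # equation (5)
            [0, 0, w, 1],                          # equation (6)
            [w * a1, a2, w * a1 * nu, a2 * nu],    # equation (7)
            [w * b1, b2, w * b1 * nu2, b2 * nu2],  # equation (8)
        ]
        expected = w * w * (nu - nu2) * (b1 - b2) * (a1 - a2) % q
        if det_mod(M, q) != expected:
            return False
    return True
```

The columns correspond to the unknowns $x_1, x_2, y_1, y_2$. The factorisation makes the non-degeneracy conditions visible: the determinant vanishes exactly when $w = 0$, $\nu = \nu'$, $b_1 = b_2$ or $a_1 = a_2$.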
Abstract. In recent years, many formalizations of security properties
have been proposed, most of which are based on different underlying
models and are consequently difficult to compare. A classification of
security properties is thus of interest for understanding the relationships
among different definitions and for evaluating the relative merits. In this
paper, many non-interference-like properties proposed for computer
security are classified and compared in a unifying framework. The resulting
taxonomy is evaluated through some case studies of access control in
computer systems. The approach has been mechanized, resulting in the
tool CoSeC. Various extensions (e.g., the application to cryptographic
protocol analysis) and open problems are discussed.

This paper mainly follows [21] and covers the first part of the course
“Classification of Security Properties” given by Roberto Gorrieri and
Riccardo Focardi at the FOSAD’00 school.


1  Introduction 

The widespread use of distributed systems, where resources and data are shared
among users located almost everywhere in the world, has enormously increased
the interest in security issues. In this context, it is likely that a user gets some
(possibly) malicious programs from an untrusted source on the net and executes
them inside their own system with unpredictable results. Moreover, it could be
the case that a system that is completely secure internally turns out to be insecure when
performing critical activities such as electronic commerce or home banking, due
to a “weak” mechanism for remote connections. It is important to precisely
define security properties in order to have formal statements of the correctness
of a security mechanism. As a consequence, in recent years there have been a

* This work has been partially supported by MURST projects TOSCA, “Certificazione
automatica di programmi mediante interpretazione astratta” and “Interpretazione
astratta, type systems e analisi control-flow”, and also partially supported by
Microsoft Research Europe.

Fig. 1. Information flows in multilevel security.


the whole flow of information, rather than the accesses of subjects to objects [36].
By imposing some information flow rules, it is possible to control direct and
indirect leakages alike, as, in this perspective, they both become “unwanted
information flows”.

In the literature, there are many different security definitions reminiscent of
the information flow idea, each based on some system model (see, e.g., [36, 64,
41, 48, 49, 62, 7]). In [24] we have compared and classified them, leading to our
proposed notion of Bisimulation Non Deducibility on Compositions (BNDC, for
short). We will present BNDC starting from the idea of Non Interference [36].
Through a running example and a comparison with other existing approaches,
we will try to convince the reader that such a property can effectively detect
unwanted information flows in systems, both direct and indirect.

We now describe the topics of the next sections. Section 2 presents the Security
Process Algebra (SPA, for short) language. All the properties we will present
and apply to the analysis of systems and protocols are based on such a language.
SPA is an extension of CCS [50], a language proposed to specify concurrent
systems. The basic building blocks are the atomic activities, simply called
actions; unlike CCS, in SPA actions belong to two different levels of confidentiality,
thus allowing the specification of multilevel (actually, two-level) systems. As for
CCS, the model used to describe the operational semantics of SPA is the labelled
transition system model [43], where the states are the terms of the algebra. In
order to express that certain states are indistinguishable for an external observer,
semantic equivalences over terms/states are defined such that two terms are
observationally indistinguishable iff they are equivalent. As explained below, the
information flow security properties we introduce are all based on these notions
of observable behaviours.

Section 3 is about such properties, which capture the existence of information
flows among groups of users. We will see that these properties are all of the
following algebraic form. Let $E$ be an SPA process term, let $X$ be a security


Finally,  in  Section  5  we  give  some  concluding  remarks  and  discuss  some  open 
problems. 

2  SPA  and  Value-Passing 

In  this  Section  we  present  the  language  that  will  be  used  to  specify  and  analyze 
security  properties  over  concurrent  systems.  We  first  present  the  “pure”  version 
of  the  language.  Then  we  show  how  to  extend  it  with  value-passing.  Finally,  we 
present  an  example  of  value-passing  agent  specification.  It  will  be  our  running 
example  for  the  next  sections. 


2.1  The  Language 

The  Security  Process  Algebra  (SPA  for  short)  [24, 26]  is  a  slight  extension  of 
Milner’s  CCS  [50],  where  the  set  of  visible  actions  is  partitioned  into  high  level 
actions  and  low  level  ones  in  order  to  specify  multilevel  systems.  2 

SPA syntax is based on the same elements as CCS. In order to obtain a
partition of the visible actions into two levels, we consider two sets $Act_H$ and
$Act_L$ of high and low level actions which are closed with respect to function $\bar{\cdot}$
(i.e., $\overline{Act_H} = Act_H$, $\overline{Act_L} = Act_L$); moreover they form a covering of $\mathcal{L}$ and
they are disjoint (i.e., $Act_H \cup Act_L = \mathcal{L}$, $Act_H \cap Act_L = \emptyset$). Let $Act$ be the set
$Act_H \cup Act_L \cup \{\tau\}$, where $\tau$ is a special unobservable, internal action. The syntax
of SPA agents (or processes) is defined as follows:

$$E ::= 0 \mid \mu.E \mid E + E \mid E|E \mid E \setminus L \mid E[f] \mid Z$$

where $\mu$ ranges over $Act$, $L \subseteq \mathcal{L}$ and $f : Act \to Act$ is such that
$f(\bar{\alpha}) = \overline{f(\alpha)}$, $f(\tau) = \tau$. Moreover, for every constant $Z$ there must be the corresponding
definition $Z \stackrel{def}{=} E$, and $E$ must be guarded on constants. This means that the
recursive substitution of all the non prefixed (i.e., not appearing in a context
$\mu.E'$) constants in $E$ with their definitions terminates after a finite number of
steps. In other words, there exists a term obtainable by constant substitutions
from $E$ where all the possible initial actions are explicitly represented (through
the prefix operator $\mu.E$). For instance, agent $A \stackrel{def}{=} B$ with $B \stackrel{def}{=} A$ is not guarded
on constants. On the contrary, if $B$ is defined as $a.A$, then $B$ is guarded on
constants. This condition will be useful when we perform automatic checks over
SPA terms: it basically avoids infinite constant substitution loops.
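The guardedness condition can be checked mechanically when the set of constant definitions is finite. The following is a minimal sketch, not the paper's algorithm: the tuple encoding of terms (`"nil"`, `"prefix"`, `"sum"`, `"par"`, `"res"`, `"rel"`, `"const"`) and the function names are assumptions introduced only for illustration. A constant fails the check exactly when unfolding its non prefixed constants leads back to itself.

```python
from typing import Dict, Set

# Hypothetical term encoding (an assumption of this sketch):
#   ("nil",)            the process 0
#   ("prefix", mu, E)   mu.E
#   ("sum", E1, E2)     E1 + E2
#   ("par", E1, E2)     E1 | E2
#   ("res", E, L)       E \ L
#   ("rel", E, f)       E[f]
#   ("const", "Z")      a constant Z


def unguarded_constants(term: tuple) -> Set[str]:
    """Constants occurring in `term` outside of any prefix mu.E'."""
    kind = term[0]
    if kind in ("nil", "prefix"):
        return set()  # everything below a prefix is guarded
    if kind == "const":
        return {term[1]}
    if kind in ("sum", "par"):
        return unguarded_constants(term[1]) | unguarded_constants(term[2])
    if kind in ("res", "rel"):
        return unguarded_constants(term[1])
    raise ValueError(f"unknown term kind: {kind}")


def is_guarded(name: str, defs: Dict[str, tuple]) -> bool:
    """True iff unfolding the unguarded constants of `name` terminates."""
    on_path: Set[str] = set()
    done: Set[str] = set()

    def visit(z: str) -> bool:
        if z in done:
            return True
        if z in on_path:
            return False  # z reached again without crossing a prefix
        on_path.add(z)
        ok = all(visit(c) for c in unguarded_constants(defs[z]))
        on_path.discard(z)
        if ok:
            done.add(z)
        return ok

    return visit(name)


# A = B with B = A is not guarded; with B = a.A both constants are guarded.
loop = {"A": ("const", "B"), "B": ("const", "A")}
fine = {"A": ("const", "B"), "B": ("prefix", "a", ("const", "A"))}
print(is_guarded("A", loop), is_guarded("B", fine))  # False True
```

The sketch assumes finitely many definitions, so non-termination of the substitution can only arise from a cycle of unguarded references.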

Intuitively, we have that $0$ is the empty process, which cannot do any action;
$\mu.E$ can do an action $\mu$ and then behaves like $E$; $E_1 + E_2$ can alternatively

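2 Actually, only two-level systems can be specified; note that this is not a real limitation
because it is always possible to deal with the multilevel case by grouping, in several
ways, the various levels in two clusters.

Sum          $\dfrac{E_1 \xrightarrow{\mu} E_1'}{E_1 + E_2 \xrightarrow{\mu} E_1'}$    $\dfrac{E_2 \xrightarrow{\mu} E_2'}{E_1 + E_2 \xrightarrow{\mu} E_2'}$

Parallel     $\dfrac{E_1 \xrightarrow{\mu} E_1'}{E_1|E_2 \xrightarrow{\mu} E_1'|E_2}$    $\dfrac{E_2 \xrightarrow{\mu} E_2'}{E_1|E_2 \xrightarrow{\mu} E_1|E_2'}$    $\dfrac{E_1 \xrightarrow{a} E_1' \quad E_2 \xrightarrow{\bar{a}} E_2'}{E_1|E_2 \xrightarrow{\tau} E_1'|E_2'}$

Restriction  $\dfrac{E \xrightarrow{\mu} E'}{E \setminus L \xrightarrow{\mu} E' \setminus L}$  if $\mu \notin L$

Relabelling  $\dfrac{E \xrightarrow{\mu} E'}{E[f] \xrightarrow{f(\mu)} E'[f]}$

Constant     $\dfrac{E \xrightarrow{\mu} E'}{A \xrightarrow{\mu} E'}$  if $A \stackrel{def}{=} E$

Table 1. The operational rules for SPA.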


as for CCS by structural induction as the least relation generated by the axioms
and inference rules reported in Table 1. The operational semantics for an agent
$E$ is the subpart of the SPA LTS reachable from the initial state $E$. We denote
with $\mathcal{E}_{FS}$ the set of all the SPA agents with a finite LTS as operational semantics.
Table 2 shows some simple examples of SPA terms with their corresponding LTSs.

Now  we  introduce  the  idea  of  observable  behaviour:  two  systems  should  have 
the  same  semantics  if  and  only  if  they  cannot  be  distinguished  by  an  external 
observer.  To  obtain  this  we  define  an  equivalence  relation  over  states/ terms  of 
the  SPA  LTS,  equating  two  processes  when  they  are  indistinguishable.  In  this 
way  the  semantics  of  a  term  becomes  an  equivalence  class  of  terms. 

It is possible to define various equivalences of this kind, according to the
different assumptions on the power of observers. We recall three of them. The
first one is the classic definition of trace equivalence, according to which two
agents are equivalent if they have the same execution traces. The second one
discriminates agents also according to the nondeterministic structure of their
LTSs. This equivalence is based on the concept of bisimulation [50]. The last one,
introduced for the CSP language [39], is able to observe which actions are not
executable after a certain trace (failure sets), thus detecting possible deadlocks.

Since we want to focus only on observable actions, we need a transition
relation which ignores internal $\tau$ moves. This can be defined as
follows:


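Definition 1. The expression $E \stackrel{\alpha}{\Longrightarrow} E'$ is a shorthand for
$E (\xrightarrow{\tau})^* E_1 \xrightarrow{\alpha} E_2 (\xrightarrow{\tau})^* E'$, where $(\xrightarrow{\tau})^*$ denotes a (possibly empty)
sequence of $\tau$ labelled transitions. Let $\gamma = \alpha_1 \ldots \alpha_n \in \mathcal{L}^*$ be a sequence of
actions; then $E \stackrel{\gamma}{\Longrightarrow} E'$ if and only if there exist $E_1, E_2, \ldots, E_{n-1} \in \mathcal{E}$ such that
$E \stackrel{\alpha_1}{\Longrightarrow} E_1 \stackrel{\alpha_2}{\Longrightarrow} \cdots \stackrel{\alpha_{n-1}}{\Longrightarrow} E_{n-1} \stackrel{\alpha_n}{\Longrightarrow} E'$.

For the empty sequence $\langle\rangle$ we have that $E \stackrel{\langle\rangle}{\Longrightarrow} E'$ stands for $E (\xrightarrow{\tau})^* E'$. We say
that $E'$ is reachable from $E$ when $\exists \gamma : E \stackrel{\gamma}{\Longrightarrow} E'$ and we write $E \Longrightarrow E'$. ■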

Trace Equivalence We define trace equivalence as follows:

Definition 2. For any $E \in \mathcal{E}$ the set $T(E)$ of traces associated with $E$ is defined
as follows: $T(E) = \{\gamma \in \mathcal{L}^* \mid \exists E' : E \stackrel{\gamma}{\Longrightarrow} E'\}$. $E$ and $F$ are trace equivalent
(notation $E \approx_T F$) if and only if $T(E) = T(F)$. ■
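Definitions 1 and 2 can be made concrete on a finite LTS. The sketch below is an illustration under stated assumptions rather than a tool from the paper: an LTS is encoded as a dictionary from state names to lists of `(label, next_state)` pairs, the string `"tau"` stands for $\tau$, and traces are enumerated only up to a length bound `max_len` so that the computation terminates on cyclic systems. The two example systems are hypothetical processes `a.b.0 + a.c.0` and `a.(b.0 + c.0)`.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

TAU = "tau"  # stands for the internal action
LTS = Dict[str, List[Tuple[str, str]]]  # state -> [(label, next state)]


def tau_closure(lts: LTS, states: Set[str]) -> FrozenSet[str]:
    """All states reachable from `states` through zero or more tau steps."""
    seen = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for label, t in lts.get(s, []):
            if label == TAU and t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)


def traces(lts: LTS, start: str, max_len: int) -> Set[Tuple[str, ...]]:
    """Weak traces of `start` (tau steps ignored), up to length `max_len`."""
    result: Set[Tuple[str, ...]] = set()
    frontier = {((), tau_closure(lts, {start}))}
    while frontier:
        trace, states = frontier.pop()
        result.add(trace)
        if len(trace) == max_len:
            continue
        by_label: Dict[str, Set[str]] = {}
        for s in states:
            for label, t in lts.get(s, []):
                if label != TAU:
                    by_label.setdefault(label, set()).add(t)
        for label, targets in by_label.items():
            frontier.add((trace + (label,), tau_closure(lts, targets)))
    return result


# Hypothetical examples: E = a.b.0 + a.c.0 and F = a.(b.0 + c.0)
E_LTS = {"E": [("a", "E1"), ("a", "E2")], "E1": [("b", "0")], "E2": [("c", "0")], "0": []}
F_LTS = {"F": [("a", "F1")], "F1": [("b", "0"), ("c", "0")], "0": []}
print(traces(E_LTS, "E", 3) == traces(F_LTS, "F", 3))  # True
```

Comparing bounded trace sets only approximates $\approx_T$; for finite-state agents an exact check would compare the determinised automata instead.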

Observational Equivalence Bisimulation is based on an idea of mutual
step-by-step simulation, i.e., when $E$ executes a certain action moving to $E'$, $F$ must
be able to simulate this single step by executing the same action and moving to
an agent $F'$ which is again bisimilar to $E'$ (this is because it must be able to
simulate every successive step of $E'$), and vice-versa.

A weak bisimulation is a bisimulation which does not care about internal $\tau$
actions. So, when $F$ simulates an action of $E$, it can also execute some $\tau$ actions
before or after that action.

In the following, $E \stackrel{\hat{\mu}}{\Longrightarrow} E'$ stands for $E \stackrel{\mu}{\Longrightarrow} E'$ if $\mu \in \mathcal{L}$, and for $E (\xrightarrow{\tau})^* E'$
if $\mu = \tau$ (note that $(\xrightarrow{\tau})^*$ means “zero or more $\tau$ labelled transitions” while
$\stackrel{\tau}{\Longrightarrow}$ requires at least one $\tau$ labelled transition).

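Definition 3. A relation $R \subseteq \mathcal{E} \times \mathcal{E}$ is a weak bisimulation if $(E, F) \in R$
implies, for all $\mu \in Act$,

- whenever $E \xrightarrow{\mu} E'$, then there exists $F' \in \mathcal{E}$ such that $F \stackrel{\hat{\mu}}{\Longrightarrow} F'$ and
  $(E', F') \in R$;

- conversely, whenever $F \xrightarrow{\mu} F'$, then there exists $E' \in \mathcal{E}$ such that $E \stackrel{\hat{\mu}}{\Longrightarrow} E'$
  and $(E', F') \in R$.

Two SPA agents $E, F \in \mathcal{E}$ are observationally equivalent, notation $E \approx_B F$, if
there exists a weak bisimulation containing the pair $(E, F)$. ■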
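For finite-state agents, Definition 3 suggests a direct (if inefficient) check: start from the relation containing all pairs of states and repeatedly discard the pairs that violate one of the two clauses, until nothing changes. The sketch below follows this idea; it is not the algorithm of any specific tool, and the LTS encoding (a dictionary from states to `(label, next_state)` lists, with `"tau"` for $\tau$) and the example processes are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

TAU = "tau"
LTS = Dict[str, List[Tuple[str, str]]]


def tau_reach(lts: LTS, s: str) -> Set[str]:
    """States reachable from s through zero or more tau steps."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for label, t in lts.get(u, []):
            if label == TAU and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def weak_moves(lts: LTS, s: str, mu: str) -> Set[str]:
    """Targets of the 'hatted' weak transition labelled mu from s."""
    before = tau_reach(lts, s)
    if mu == TAU:
        return before  # zero or more tau steps
    after: Set[str] = set()
    for u in before:
        for label, t in lts.get(u, []):
            if label == mu:
                after |= tau_reach(lts, t)
    return after


def weakly_bisimilar(lts: LTS, e: str, f: str) -> bool:
    """Naive greatest-fixed-point check; both agents live in the same LTS."""
    states = set(lts) | {t for moves in lts.values() for _, t in moves}
    rel = {(p, q) for p in states for q in states}

    def simulates(p: str, q: str, flip: bool) -> bool:
        # Every strong move of p must be matched by a weak move of q.
        for mu, p2 in lts.get(p, []):
            matched = any(
                ((q2, p2) if flip else (p2, q2)) in rel
                for q2 in weak_moves(lts, q, mu)
            )
            if not matched:
                return False
        return True

    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            if not (simulates(p, q, False) and simulates(q, p, True)):
                rel.discard((p, q))
                changed = True
    return (e, f) in rel


# Hypothetical agents: E = a.b.0 + a.c.0, F = a.(b.0 + c.0), P = tau.a.0, Q = a.0
SYSTEM = {
    "E": [("a", "E1"), ("a", "E2")], "E1": [("b", "0")], "E2": [("c", "0")],
    "F": [("a", "F1")], "F1": [("b", "0"), ("c", "0")],
    "P": [("tau", "P1")], "P1": [("a", "0")], "Q": [("a", "0")],
    "0": [],
}
print(weakly_bisimilar(SYSTEM, "E", "F"), weakly_bisimilar(SYSTEM, "P", "Q"))  # False True
```

The first pair has the same traces but is told apart because `E1` cannot match the `c` move of `F1`; the second pair differs only by an initial $\tau$ step, which weak bisimulation ignores.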

In [50] it is proved that $\approx_B$ is an equivalence relation. Moreover, it is easy to
see that $E \approx_B F$ implies $E \approx_T F$; indeed, if $E \approx_B F$ then $F$ must be able to
simulate every sequence of visible actions executed by $E$, i.e., every trace of $E$;
since the simulation corresponds to the execution of the actions interleaved with
some $\tau$'s, every trace of $E$ is also a trace of $F$. Symmetrically, $E$ must be
able to simulate every sequence of $F$. So $E \approx_T F$.

In Figure 2 there is an example of two trace-equivalent systems which are not
observationally equivalent. In fact both $E$ and $F$ can execute the three traces $a$,
$ab$ and $ac$. However, it is not possible for $E$ to simulate process $F$ step by step.


where $\approx_B S \approx_B$ is composition of binary relations, so that $E' \approx_B S \approx_B F'$
means that for some $E''$ and $F''$ we have $E' \approx_B E''$, $(E'', F'') \in S$, $F'' \approx_B F'$.

■

The following result is proved in [50]:

Proposition 1. If $S$ is a bisimulation up to $\approx_B$ then $S \subseteq \approx_B$.

Hence, to prove $P \approx_B Q$, we only have to find a bisimulation up to $\approx_B$ which
contains $(P, Q)$. This is one of the proof techniques we will often adopt in the
following.

Failure/Testing Equivalence The failure semantics [9], introduced for the
CSP language, is a refinement of the trace semantics where it is possible to
observe which actions are not executable after a certain trace. In particular, a
system is characterized by the so-called failures set, i.e., a set of pairs $(s, X)$
where $s$ is a trace and $X$ is a set of actions. For each pair $(s, X)$, the system
must be able, by executing trace $s$, to reach a state where every action in $X$
cannot be executed.$^4$ For instance, consider again agents $E'$ and $F'$ of Figure 3.
As we said above, $E'$ can stop after the execution of $a$ and, consequently, $E'$ can
refuse to execute action $b$ after the execution of $a$. So $E'$ has the pair $(a, \{b\})$ in
its failure set. System $F'$ is always able to execute $b$ after the execution of $a$. So
$F'$ does not have $(a, \{b\})$ in the failure set, hence it is not failure equivalent to
$E'$. We deduce that failure semantics is also able to detect deadlocks. Moreover,
systems $E$ and $F$ of Figure 2 are not failure equivalent either.

A different characterization of failure equivalence (called testing equivalence)
has been given in [52]. It is based on the idea of tests. We can see a test $T$ as
any SPA process which can execute a particular success action $\omega \notin \mathcal{L}$. A test $T$
is applied to a system $E$ using the parallel composition operator $|$. A test $T$ may
be satisfied by system $E$ if and only if system $(E|T) \setminus \mathcal{L}$ can execute $\omega$. Note
that in system $(E|T) \setminus \mathcal{L}$ we force the synchronization of $E$ with test $T$.

Definition 5. $E$ may $T$ if and only if $(E|T) \setminus \mathcal{L} \stackrel{\omega}{\Longrightarrow} (E'|T') \setminus \mathcal{L}$. ■

A maximal computation of $(E|T) \setminus \mathcal{L}$ is a sequence
$(E|T) \setminus \mathcal{L} = ET_0 \xrightarrow{\tau} ET_1 \xrightarrow{\tau} \ldots \xrightarrow{\tau} ET_n \xrightarrow{\tau} \ldots$ which can be finite or infinite;
if it is finite the last term must have no outgoing transitions. A test $T$ must be
satisfied by $E$ if and only if every maximal computation of $(E|T) \setminus \mathcal{L}$ contains a
state $ET_i$ which can execute $\omega$.

Definition 6. $E$ must $T$ if and only if for all maximal computations of
$ET_0 = (E|T) \setminus \mathcal{L}$, $\exists i$ such that $ET_i \xrightarrow{\omega} ET_i'$. ■
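The may and must predicates of Definitions 5 and 6 can be explored on finite systems by building the states of $(E|T) \setminus \mathcal{L}$ on the fly. The following is only a sketch under explicit assumptions: LTSs are dictionaries from states to `(label, next_state)` lists, the overbar is spelled with a hypothetical `~` prefix (`"a"` versus `"~a"`), `"omega"` is the success action and is performed by the test only, and both systems are finite-state. A must-test fails when some computation deadlocks, or loops forever on $\tau$ steps, without ever reaching a state that enables `omega`.

```python
from typing import Dict, List, Set, Tuple

TAU, OMEGA = "tau", "omega"
LTS = Dict[str, List[Tuple[str, str]]]
State = Tuple[str, str]  # (state of E, state of T)


def co(action: str) -> str:
    """Complement of an action: 'a' <-> '~a' (hypothetical spelling of the overbar)."""
    return action[1:] if action.startswith("~") else "~" + action


def successors(sys: LTS, test: LTS, state: State) -> List[State]:
    """Tau moves of (E|T) restricted on all visible actions."""
    e, t = state
    out: List[State] = []
    for a, e2 in sys.get(e, []):
        if a == TAU:
            out.append((e2, t))  # internal step of E
        else:
            # forced synchronisation of E with a complementary action of T
            out.extend((e2, t2) for b, t2 in test.get(t, []) if b == co(a))
    out.extend((e, t2) for b, t2 in test.get(t, []) if b == TAU)
    return out


def succeeds(test: LTS, state: State) -> bool:
    """True iff the test component can perform the success action."""
    return any(b == OMEGA for b, _ in test.get(state[1], []))


def may(sys: LTS, test: LTS, e0: str, t0: str) -> bool:
    seen: Set[State] = {(e0, t0)}
    stack = [(e0, t0)]
    while stack:
        s = stack.pop()
        if succeeds(test, s):
            return True
        for n in successors(sys, test, s):
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return False


def must(sys: LTS, test: LTS, e0: str, t0: str) -> bool:
    on_path: Set[State] = set()
    safe: Set[State] = set()

    def visit(s: State) -> bool:
        if succeeds(test, s) or s in safe:
            return True
        if s in on_path:
            return False  # tau cycle that never enables omega
        nxt = successors(sys, test, s)
        if not nxt:
            return False  # deadlock before success
        on_path.add(s)
        ok = all(visit(n) for n in nxt)
        on_path.discard(s)
        if ok:
            safe.add(s)
        return ok

    return visit((e0, t0))


# Hypothetical agents: P = a.0, Q = a.0 + tau.0, and the test T = ~a.omega.0
P = {"P": [("a", "0")], "0": []}
Q = {"Q": [("a", "0"), ("tau", "0")], "0": []}
T = {"T": [("~a", "T1")], "T1": [(OMEGA, "0")], "0": []}
print(must(P, T, "P", "T"), must(Q, T, "Q", "T"), may(Q, T, "Q", "T"))  # True False True
```

In this toy example `Q` may silently move to a deadlocked state, so the test is passed in the may sense but not in the must sense, which is the kind of difference the failure/testing setting is designed to observe.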

Now we can define testing equivalence as follows:

Definition 7. Two systems $E$ and $F$ are testing equivalent, $E \approx_{test} F$, if and
only if

4  Indeed,  there  is  a  condition  on  the  traces  s  in  pairs  (s,X).  It  is  required  that  during 
any  execution  of  s  no  infinite  internal  computation  sequences  are  possible.  We  will 
analyze  this  aspect  more  in  detail  in  the  following. 



Fig.  5.  Systems  observationally  equivalent  but  not  testing  equivalent. 


aspect will be analyzed in more detail in the next chapters, when we will try to
use $\approx_{test}$ in the specification of security properties.

We conclude this section presenting a class of finite state agents. This will be
useful in the automatic verification of security properties. This class of agents
consists of the so-called nets of automata:

$$p ::= 0 \mid \mu.p \mid p + p \mid Z$$
$$E ::= p \mid E|E \mid E \setminus L \mid E \setminus_I L \mid E/L \mid E[f]$$

where for every constant $Z$ there must be the corresponding definition $Z \stackrel{def}{=} p$
(with $p$ guarded on constants). It is easy to prove that every agent of this form
is finite state. However, this condition is not necessary, in the sense that other
agents not belonging to the class of nets of automata are finite state as well. For
instance, consider $B \stackrel{def}{=} a.0 + D \setminus \{i\}$ with $D \stackrel{def}{=} i.(o.0|D)$. It can execute only an
action $a$ and then it stops, so it is clearly finite state. However note that it does
not conform to the syntax of nets of automata since there is a parallel operator
underneath a sum.

2.3  Value-Passing  SPA 

In this section we briefly present a value-passing extension of “pure” SPA (VSPA,
for short). All the examples contained in this paper will use this value-passing
calculus, because it leads to more readable specifications than those written
in pure SPA. Here we present a very simple example of a value-passing agent
showing how it can be translated into a pure SPA agent. Then we define the
VSPA syntax and we sketch the semantics by translating a generic VSPA agent
into its corresponding SPA agent.

As an example, consider the following buffer cell [50]:

$C \stackrel{def}{=} in(x).C'(x)$

$C'(x) \stackrel{def}{=} \overline{out}(x).C$


where $x$ is a variable that can assume values in $\mathbb{N}$ (we usually write $x \in \mathbb{N}$). $C$
reads a natural number $n$ through action $in$ and stores it in variable $x$. Then
this value is passed to agent $C'$ which can give $n$ as output through action $out$


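We will use the notation $E\{a/b\}$ to represent agent $E$ with all the occurrences
of $b$ substituted by $a$. We will also use $\mathcal{E}^+$ to denote the set of VSPA agents.
For each agent $E \in \mathcal{E}^+$ without free variables, its translation $[\![E]\!]$ is given in
Table 3, where $ari(a) = S_1 \ldots S_n$;
$\hat{L} = \{l_{v_1,\ldots,v_n} : l \in L,\ ari(l) = S_1 \ldots S_n,\ \forall i \in [1,n]\ v_i \in S_i\}$
is the set of the translations of actions in $L$, and $\hat{f}(l_{v_1,\ldots,v_n}) = f(l)_{v_1,\ldots,v_n}$
is the translation of relabelling function $f$.

$E \in \mathcal{E}^+$                          $[\![E]\!] \in \mathcal{E}$
$0$                                     $0$
$a(x_1, \ldots, x_n).E$                 $\sum_{v_i \in S_i,\ i \in [1,n]} a_{v_1,\ldots,v_n}.[\![E\{v_1/x_1, \ldots, v_n/x_n\}]\!]$
$\bar{a}(e_1, \ldots, e_n).E$           $\bar{a}_{e_1,\ldots,e_n}.[\![E]\!]$
$\tau.E$                                $\tau.[\![E]\!]$
$E_1 + E_2$                             $[\![E_1]\!] + [\![E_2]\!]$
$E_1|E_2$                               $[\![E_1]\!]\,|\,[\![E_2]\!]$
$E \setminus L$                         $[\![E]\!] \setminus \hat{L}$
$E \setminus_I L$                       $[\![E]\!] \setminus_I \hat{L}$
$E/L$                                   $[\![E]\!]/\hat{L}$
$E[f]$                                  $[\![E]\!][\hat{f}]$
$A(e_1, \ldots, e_n)$                   $A_{e_1,\ldots,e_n}$
if $b$ then $E$                         $[\![E]\!]$ if $b = True$, $0$ otherwise

Table 3. Translation of VSPA to SPA.

Furthermore, the single definition $A(x_1, \ldots, x_m) \stackrel{def}{=} E$, with $ari(A) = S_1 \ldots S_m$,
is translated to the set of equations:

$\{A_{v_1,\ldots,v_m} \stackrel{def}{=} [\![E\{v_1/x_1, \ldots, v_m/x_m\}]\!],\ v_i \in S_i\ \forall i \in [1,m]\}$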
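To see the translation at work, the sketch below expands the buffer cell $C$ into pure equations, following the rule that an input prefix becomes a sum over all values and a parametric definition becomes one equation per value tuple. It is a rough illustration only: the helper names (`indexed`, `input_prefix`, `translate_definition`), the textual rendering of terms as strings, and the finite sort `{0, 1}` used in place of $\mathbb{N}$ are all assumptions of this sketch, and the overbar on the output action is not represented.

```python
from itertools import product
from typing import Callable, Dict, Sequence


def indexed(name: str, values: Sequence[int]) -> str:
    """Pure-SPA name of an action or constant applied to concrete values."""
    if not values:
        return name
    return name + "_" + "_".join(str(v) for v in values)


def input_prefix(action: str, sorts: Sequence[Sequence[int]],
                 continuation: Callable[..., str]) -> str:
    """a(x1..xn).E becomes a sum with one summand per tuple of values."""
    return " + ".join(
        f"{indexed(action, vs)}.{continuation(*vs)}" for vs in product(*sorts)
    )


def translate_definition(name: str, sorts: Sequence[Sequence[int]],
                         body: Callable[..., str]) -> Dict[str, str]:
    """A(x1..xm) = E becomes one equation A_v1..vm = E{v/x} per value tuple."""
    return {indexed(name, vs): body(*vs) for vs in product(*sorts)}


VALUES = [0, 1]  # assumption: a finite sort standing in for the naturals

equations: Dict[str, str] = {}
# C = in(x).C'(x)
equations.update(translate_definition(
    "C", [], lambda: input_prefix("in", [VALUES], lambda x: indexed("C'", [x]))))
# C'(x) = out(x).C   (output action; the overbar is omitted in this rendering)
equations.update(translate_definition(
    "C'", [VALUES], lambda x: f"{indexed('out', [x])}.C"))

for lhs, rhs in equations.items():
    print(lhs, "=", rhs)
```

With the two-value sort, the cell is rendered as `C = in_0.C'_0 + in_1.C'_1`, together with one equation `C'_v = out_v.C` for each value `v`.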

Note that we do not partition the set of actions into two levels; we directly
refer to the partition in the pure calculus. In this way it is possible for a certain
action in VSPA to correspond, in the translation, to actions at different levels in
SPA. This can be useful if we want a parameter representing the level of a certain
action. As an example consider an action $access\_r(l, x)$ with $l \in \{high, low\}$ and
$x \in [1, n]$, representing a read request from a user at level $l$ to an object $x$; we
can assign the high level to the actions with $l = high$ and the low level to the
others in this way: $access\_r(high, x) \in Act_H$ and $access\_r(low, x) \in Act_L$ for all
$x \in [1, n]$.$^6$

6 Note that $access\_r(high, x)$ stands for $access\_r_{high,x}$, with $x \in [1, n]$. Indeed, for the
sake of readability, we often write $c(v)$ instead of its translation $c_v$.

Fig. 6. The Access Monitor for Example 1.


only from the low one. Users interact with the monitor through the following
access actions: $access\_r(l, x)$, $access\_w(l, x)$, $write(l, z)$, where $l$ is the user level
($l = 0$ low, $l = 1$ high), $x$ is the object ($x = 0$ low, $x = 1$ high) and $z$ is the
binary value to be written.

As an example, consider $access\_r(0, 1)$, which represents a low level user
($l = 0$) read request from the high level object ($x = 1$), and $access\_w(1, 0)$
followed by $write(1, 0)$, which represents a high level user ($l = 1$) write request
of value 0 ($z = 0$) on the low object ($x = 0$). Read results are returned to
users through the output actions $val(l, y)$. This can also be an error in case of a
read-up request. Note that if a high level user tries to write on the low object,
through $access\_w(1, 0)$ followed by $write(1, z)$, such a request is not executed
and no error message is returned.

In order to understand how the system works, let us consider the following
sequence of transitions representing the writing of value 1 in the low level object,
performed by the low level user:

$(Monitor \mid Object(1,0) \mid Object(0,0)) \setminus L$
$\xrightarrow{access\_w(0,0)} (write(0,z).w(0,z).Monitor \mid Object(1,0) \mid Object(0,0)) \setminus L$
$\xrightarrow{write(0,1)} (w(0,1).Monitor \mid Object(1,0) \mid Object(0,0)) \setminus L$
$\xrightarrow{\tau} (Monitor \mid Object(1,0) \mid Object(0,1)) \setminus L$

The trace corresponding to this sequence of transitions is
$access\_w(0,0).write(0,1)$
and so we can write:

$(Monitor \mid Object(1,0) \mid Object(0,0)) \setminus L$
$\stackrel{access\_w(0,0).write(0,1)}{\Longrightarrow} (Monitor \mid Object(1,0) \mid Object(0,1)) \setminus L$

Note that, after the execution of the trace, the low level object contains value 1.


interfere with the low level if the effects of high level inputs are not visible by a
low level user. This idea can be rephrased on the LTS model as follows. We
consider every trace $\gamma$ of the system containing high level inputs. Then, we check whether
there exists another trace $\gamma'$ with the same subsequence of low level actions and
without high inputs. A low level user, who can only observe low level actions,
is not able to distinguish $\gamma$ and $\gamma'$. As both $\gamma$ and $\gamma'$ are legal traces, we can
conclude that the possible execution of the high level inputs in $\gamma$ has no effect
on the low level view of the system.

As for NI, we can define this property by using some functions which
manipulate sequences of actions. In particular, it is sufficient to consider the function
$low : \mathcal{L}^* \to Act_L^*$ which takes a trace $\gamma$ and removes all the high level actions
from it, i.e., returns the low level subsequence of $\gamma$. Moreover we use the
function $highinput : \mathcal{L}^* \to (Act_H \cap I)^*$ which extracts from a trace the subsequence
composed of all the high level inputs.

Definition 8. (NNI: Non-deterministic Non Interference)

$E \in NNI$ if and only if $\forall \gamma \in T(E)$, $\exists \delta \in T(E)$ such that

(i) $low(\gamma) = low(\delta)$

(ii) $highinput(\delta) = \langle\rangle$

where $\langle\rangle$ denotes the empty sequence. ■
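Definition 8 can be read directly as a check over a finite set of traces. The sketch below is a simplified illustration, not the verification method of the paper: it assumes the trace set is given explicitly (for example, the traces of a finite acyclic system), that actions are plain strings, and that the caller supplies the set `high` of high level actions and the subset `high_inputs` of high level inputs. The two example systems are hypothetical.

```python
from typing import Set, Tuple

Trace = Tuple[str, ...]


def low(trace: Trace, high: Set[str]) -> Trace:
    """Remove all high level actions from a trace."""
    return tuple(a for a in trace if a not in high)


def highinput(trace: Trace, high_inputs: Set[str]) -> Trace:
    """Subsequence of the high level inputs occurring in a trace."""
    return tuple(a for a in trace if a in high_inputs)


def is_nni(traces: Set[Trace], high: Set[str], high_inputs: Set[str]) -> bool:
    """Every trace must share its low view with a trace free of high inputs."""
    clean_views = {
        low(d, high) for d in traces if highinput(d, high_inputs) == ()
    }
    return all(low(g, high) in clean_views for g in traces)


HIGH = {"h"}          # hypothetical high level action, also a high input
HIGH_INPUTS = {"h"}

# E1 = h.l.0: the low action l is only possible after the high input h.
t1 = {(), ("h",), ("h", "l")}
# E2 = h.l.0 + l.0: l is possible with or without the high input.
t2 = {(), ("h",), ("h", "l"), ("l",)}
print(is_nni(t1, HIGH, HIGH_INPUTS), is_nni(t2, HIGH, HIGH_INPUTS))  # False True
```

In the first system the low view `l` arises only from a trace containing `h`, so a low level user who observes `l` learns that the high input occurred.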

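This may be expressed algebraically as:

Proposition 2. $E \in NNI \iff (E \setminus_I Act_H)/Act_H \approx_T E/Act_H$.

Proof. ($\Rightarrow$) It is enough to show that if $E$ is NNI then $T(E/Act_H) \subseteq T((E \setminus_I Act_H)/Act_H)$,
because the opposite inclusion simply derives from $T(E \setminus_I Act_H) \subseteq T(E)$.
Let $\gamma \in T(E/Act_H)$; then, by definition of the $/$ operator, $\exists \gamma' \in T(E)$ such
that $low(\gamma') = \gamma$. Since $E \in NNI$ then $\exists \delta \in T(E)$ such that $low(\gamma') = low(\delta)$
and $highinput(\delta) = \langle\rangle$. Hence $\delta \in T(E \setminus_I Act_H)$ and
$\delta' = low(\delta) \in T((E \setminus_I Act_H)/Act_H)$. Since $\gamma = low(\gamma') = low(\delta) = \delta'$, then $\gamma = \delta'$ and thus
$\gamma \in T((E \setminus_I Act_H)/Act_H)$.

($\Leftarrow$) Let $\gamma \in T(E)$. Then $\exists \delta \in T(E \setminus_I Act_H)$ such that $low(\gamma) = low(\delta)$. Since
$\delta \in T(E \setminus_I Act_H)$ then $highinput(\delta) = \langle\rangle$ and $\delta \in T(E)$. ■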

As a matter of fact, in $E/Act_H$ all the high level actions are hidden, hence
giving the low level view of the system; $E \setminus_I Act_H$ instead prevents traces from
containing high level inputs. So, if the two terms are equivalent, then for every
trace with high level inputs we can find another trace without such actions but
with the same subsequence of low level actions.

In the following we will consider this algebraic characterization as the
definition of NNI. Indeed, all the other properties we present below in this section
are defined using this compact algebraic style. An interesting advantage of this
style is that it reduces the check of a security property to the “standard” and
well studied problem of checking the semantic equivalence of two LTSs.

It is possible to prove that Access_Monitor_1 of Example 1 is NNI. In fact,
the next example shows that NNI is able to detect whether the multilevel access
control rules are implemented correctly in the monitor.


is missing then there will be a particular execution where some classified
information is disclosed to low level users (otherwise such a rule would be useless).
This will certainly modify the low level view of the system, making it not NNI.

However, the next example shows that NNI is not adequate to deal with
synchronous communications and, consequently, it is too weak for SPA agents.

Example 3. Consider Access_Monitor_1, and suppose that we want to add a
high level output signal which informs high level users about write operations of
low level users in the high level object. This could be useful to know the integrity
of high level information. We obtain the VSPA agent of Table 6 with the new
$written\_up$ action and where $\forall i \in \{0, 1\}$, $written\_up(i) \in Act_H$.


Access_Monitor_3 $\stackrel{def}{=}$ (Monitor_3 | Object(1, 0) | Object(0, 0)) \ L

Monitor_3 $\stackrel{def}{=}$ access_r(l, x).
                (if x ≤ l then
                    r(x, y).val(l, y).Monitor_3
                 else
                    val(l, err).Monitor_3)
              +
              access_w(l, x).write(l, z).
                (if x = l then
                    w(x, z).Monitor_3
                 else
                    if x > l then
                        w(x, z).written_up(z).Monitor_3
                    else
                        Monitor_3)


Table 6. The Access_Monitor_3 System.


It is possible to prove that Access_Monitor_3 is NNI. However, consider the
following trace of Access_Monitor_3:

$\gamma = access\_w(0,1).write(0,0).written\_up(0).access\_w(0,1).write(0,0)$

where a low level user writes value 0 twice into the high level object. If
we purge $\gamma$ of high level actions (i.e. of $written\_up$) we obtain the following
sequence:

$\gamma' = access\_w(0,1).write(0,0).access\_w(0,1).write(0,0)$


Access_Monitor_4 $\stackrel{def}{=}$ (Monitor_4 | Object(1, 0) | Object(0, 0)) \ L

Monitor_4 $\stackrel{def}{=}$ access_r(l, x).
                (if x ≤ l then
                    r(x, y).val(l, y).Monitor_4
                 else
                    val(l, err).Monitor_4)
              +
              access_w(l, x).write(l, z).
                (if x ≥ l then
                    w(x, z).Monitor_4
                 else
                    Monitor_4)
              +
              h_stop.0


Table 7. The Access_Monitor_4 System.


will know that Access_Monitor_4 is not blocked and so that $h\_stop$ has not been
executed yet. In Section 3.4 we will show how the subtle information flow caused
by a potential deadlock can be exploited in order to construct an information
channel from high level to low level. ■

In order to detect this kind of flow, it is necessary to use some notion of
equivalence which is able to detect deadlocks. Note that by simply changing the
equivalence notion in the definition of SNNI we obtain a security property which
inherits all the observation power of the new equivalence notion. So, for
detecting deadlocks, one obvious possibility could be the failure/testing setting [9, 52],
which has been designed for this purpose.


3.2  Detecting  High  Level  Deadlocks  Through  Failure/Testing 
Equivalences 

Consider the version of SNNI based on testing equivalence:

Definition 10. (testing SNNI) $E \in TSNNI \iff E/Act_H \approx_{test} E \setminus Act_H$. ■

We have that Access_Monitor_4 $\notin$ TSNNI. In fact it is sufficient to consider
the test $T \stackrel{def}{=} access\_r(0,0).\omega.0$ which is able to detect the deadlock introduced
by action $h\_stop$. In particular, we have that:


Access_Monitor_4 $\setminus Act_H$ must $T$


which represents the backup timer and periodically sends a signal in order to
obtain a backup. It is an abstraction of a clock, since in SPA it is not possible
to handle time directly.

Backup_timer $\stackrel{def}{=}$ backup.Backup_timer

Then we slightly modify the Monitor process by inserting two actions which
suspend its execution until the backup is finished.

Monitor_B $\stackrel{def}{=}$ ...
              the same as in Access_Monitor_4
              ...
              + start_backup.end_backup.Monitor_B

The backup process is enabled by the timer, then it stops the monitor, reads
the values of variables, stores them into two additional objects (Object(2, y) and
Object(3, y)) and resumes the monitor:

Backup $\stackrel{def}{=}$ backup.
           start_backup.
           r(0, y).r(1, z).
           w(2, y).w(3, z).
           end_backup.
           Backup

The access monitor with backup is given by the following system:

Access_Monitor_B $\stackrel{def}{=}$ (Monitor_B | Backup_timer | Backup | Object(0, 0) |
                     | Object(1, 0) | Object(2, 0) | Object(3, 0)) \ L

where $L = \{r, w, start\_backup, end\_backup, backup\}$. As a result, the backup
procedure of the system is something internal, i.e., an external user can see nothing
of the backup task. This makes the system divergent. In fact, if the variable
values are unchanged, then the backup procedure is a $\tau$ loop that moves the system
to the same state where it started the backup. For weak bisimulation this is not
a problem and we can analyze this new system as well. In particular, we can
check with the CoSeC tool (presented in Section 4) that Access_Monitor_B is
observationally equivalent to Access_Monitor_4. This is enough to prove
(Theorem 5) that every security analysis made on Access_Monitor_4 is valid also for
Access_Monitor_B. In particular, Access_Monitor_B is not secure because of
the potential high level deadlock we have explicitly added in Access_Monitor_4.

On the other hand, if we try to analyze this system with some testing
equivalence based property, we have an inaccurate result. Indeed Access_Monitor_B,
differently from Access_Monitor_4, is TSNNI. This happens because process

Fig. 8. BNDC intuition


Proof. It immediately follows from the fact that $E \approx_B F$ implies $E \approx_T F$. The
inclusions are proper because $E = \tau.l.0 + \tau.h.l.0$ is such that $E \in NNI, SNNI$
and $E \notin BNNI, BSNNI$. ■

Consider again Access_Monitor_4 containing the $h\_stop$ event. It is neither
BSNNI nor BNNI, as observational equivalence is able to detect deadlocks. In
particular, Access_Monitor_4$/Act_H$ can move to $0$ through an internal action
$\tau$, while Access_Monitor_4 $\setminus Act_H$ is not able to reach (in zero or more $\tau$ steps)
a state equivalent to $0$.

Now we want to show that BSNNI and BNNI are still not able to detect some
potential deadlocks due to high level activities. This will induce us to propose
another property based on a different intuition. Let us consider Access_Monitor_1.
We can prove that such a system is BSNNI as well as BNNI. However, the
following two dangerous situations are possible: (i) a high level user makes a read
request without accepting the corresponding output from the monitor (remember
that communications in SPA are synchronous) and (ii) a high level user
makes a write request and does not send the value to be written. In both cases
we have a deadlock due to a high level activity that BNNI and BSNNI are not
able to reveal. To solve this problem, we are going to present a stronger
property, called Bisimulation-based Non Deducibility on Compositions (BNDC, for
short). It is simply based on the idea of checking the system against all high
level potential interactions. A system $E$ is BNDC if for every high level process
$\Pi$ a low level user cannot distinguish $E$ from $(E|\Pi) \setminus Act_H$. In other words, a
system $E$ is BNDC if what a low level user sees of the system is not modified
by composing any high level process $\Pi$ with $E$ (see Figure 8).

Definition 12. $E$ is BNDC iff $\forall \Pi \in \mathcal{E}_H$, $E/Act_H \approx_B (E \mid \Pi) \setminus Act_H$. ■


Access_Monitor_5 $\stackrel{def}{=}$ (AM | Interf) \ N

AM $\stackrel{def}{=}$ (Monitor_5 | Object(1, 0) | Object(0, 0)) \ L

Monitor_5 $\stackrel{def}{=}$ access_r(l, x).
                (if x ≤ l then
                    r(x, y).val(l, y).Monitor_5
                 else
                    val(l, err).Monitor_5)
              +
              access_w(l, x, z).
                (if x ≥ l then
                    w(x, z).Monitor_5
                 else
                    Monitor_5)

Interf $\stackrel{def}{=}$ Interf(0) | Interf(1)

Interf(l) $\stackrel{def}{=}$ a_r(l, x).access_r(l, x).val(l, k).put(l, k).Interf(l)
              +
              a_w(l, x, z).access_w(l, x, z).Interf(l)


Table 8. The Access_Monitor_5 System.


Fig.  10.  The  inclusion  diagram  for  bisimulation-based  properties 


Figure  10  summarizes  the  relations  among  the  bisimulation-based  properties 
presented  so  far. 

It is now interesting to study the trace equivalence version of BNDC, called
NDC. Indeed it could improve on the SNNI property, which is still our best
proposal for the trace equivalence setting (no detection of deadlocks). Surprisingly,
we find that this property is exactly equal to SNNI.

Theorem 2. NDC = SNNI.

Proof. We first prove that if $E \in NDC$ then $E \in SNNI$. By hypothesis,
$E/Act_H \approx_T (E|0) \setminus Act_H$ for the specific $\Pi = 0$. Since $(E|0) \setminus Act_H \approx_T E \setminus Act_H$
then we have $E/Act_H \approx_T E \setminus Act_H$.

Now we want to prove that if $E \in SNNI$ then $E \in NDC$. By hypothesis,
$E/Act_H \approx_T E \setminus Act_H$. Since $T(E \setminus Act_H) \subseteq T((E|\Pi) \setminus Act_H)$ then we have
$T(E/Act_H) \subseteq T((E|\Pi) \setminus Act_H)$. Observe also that the reverse inclusion holds:
in fact, if $E$ and $\Pi$ synchronize on a certain high action, then $E/Act_H$ can always
“hide” it. Hence $E/Act_H \approx_T (E|\Pi) \setminus Act_H$. ■

This, in a sense, confirms that the intuition behind SNNI (and so NI) is good,
at least in a trace equivalence model. Indeed, in such a model, checking the
system against every high level process is useless: it is sufficient to
statically check whether hiding the high level actions corresponds to
restricting them. This highlights a critical point: BNDC is difficult to use in
practice because of the universal quantification over high level processes. It
would be desirable to have an alternative formulation of BNDC which avoids
universal quantification, exploiting local information only, as in the
trace-equivalence case; even though Martinelli has shown that BNDC is decidable
over finite state processes [47], a solution to this problem is still to be
found. In the same work, Martinelli also shows a negative fact regarding the
verification of BNDC: it is not compositional, i.e., if two systems are BNDC
their composition may not be. This does not permit us to reduce the
BNDC-security of a big system to the BNDC-security of its simpler subsystems
and forces us to always prove BNDC over the whole system.
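
The static check mentioned above (comparing the hiding of high level actions with their restriction) can be illustrated on a small finite LTS. The following is only a minimal sketch under stated assumptions, not the procedure of any existing tool: the dictionary encoding of the LTS, the label "tau" for internal actions and all function names are hypothetical choices made for this illustration. Trace equivalence is decided by exploring pairs of tau-closed state sets, i.e. an on-the-fly subset construction.

```python
from collections import deque

TAU = "tau"  # assumed label for internal actions

def tau_closure(lts, states):
    """All states reachable from `states` through tau steps only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for act, t in lts.get(s, []):
            if act == TAU and t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def step(lts, states, action):
    """Tau-closed set reached from `states` by one visible action."""
    targets = {t for s in states for act, t in lts.get(s, []) if act == action}
    return tau_closure(lts, targets)

def enabled(lts, states):
    """Visible actions offered by at least one state in `states`."""
    return {act for s in states for act, _ in lts.get(s, []) if act != TAU}

def trace_equivalent(lts1, init1, lts2, init2):
    """Compare the (weak) trace sets of two finite LTSs."""
    start = (tau_closure(lts1, {init1}), tau_closure(lts2, {init2}))
    seen, queue = {start}, deque([start])
    while queue:
        p, q = queue.popleft()
        if enabled(lts1, p) != enabled(lts2, q):
            return False
        for a in enabled(lts1, p):
            nxt = (step(lts1, p, a), step(lts2, q, a))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

def restrict(lts, high):
    """E \\ Act_H: remove every transition labelled by a high action."""
    return {s: [(a, t) for a, t in trs if a not in high] for s, trs in lts.items()}

def hide(lts, high):
    """E / Act_H: turn every high action into an internal tau step."""
    return {s: [(TAU if a in high else a, t) for a, t in trs] for s, trs in lts.items()}

def is_snni(lts, init, high):
    """SNNI check: E/Act_H and E\\Act_H must have the same traces."""
    return trace_equivalent(hide(lts, high), init, restrict(lts, high), init)

# h.l.0 leaks the occurrence of h; h.l.0 + l.0 does not (at the trace level).
leaky = {0: [("h", 1)], 1: [("l", 2)], 2: []}
masked = {0: [("h", 1), ("l", 2)], 1: [("l", 2)], 2: []}
print(is_snni(leaky, 0, {"h"}), is_snni(masked, 0, {"h"}))
```

No high level process is enumerated here: by Theorem 2, comparing the two derived systems is enough in the trace setting.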


$(E''|F'')/Act_H$ which is simulated by $E'/Act_H \,|\, F'/Act_H \Longrightarrow E''/Act_H \,|\, F''/Act_H$.


Now  we  can  prove  the  Theorem. 

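(i) Consider the relation $((E'|F') \setminus Act_H,\; E' \setminus Act_H \,|\, F' \setminus Act_H) \in R$ for all
$E', F'$ such that $E \Longrightarrow E'$ and $F \Longrightarrow F'$. If we prove that $R$ is a weak bisimulation
up to $\approx_B$ then, by hypothesis and Lemma 1, we obtain the thesis. We consider
the only non trivial case: $(E'|F') \setminus Act_H \stackrel{\tau}{\longrightarrow} (E''|F'') \setminus Act_H$ with $E' \stackrel{h}{\longrightarrow} E''$
and $F' \stackrel{\bar{h}}{\longrightarrow} F''$. Since $E'/Act_H \stackrel{\tau}{\longrightarrow} E''/Act_H$ and, by hypothesis, $E' \in BSNNI$,
we have that $\exists E'''$ such that $E' \setminus Act_H \Longrightarrow E''' \setminus Act_H$ and $E''/Act_H \approx_B
E''' \setminus Act_H$; finally, by hypothesis, $E'' \in BSNNI$, hence we obtain $E''' \setminus Act_H \approx_B
E'' \setminus Act_H$. Repeating the same procedure for $F'$ we have $\exists E''', F'''$ such that
$E' \setminus Act_H \,|\, F' \setminus Act_H \Longrightarrow E''' \setminus Act_H \,|\, F''' \setminus Act_H \approx_B E'' \setminus Act_H \,|\, F'' \setminus Act_H$. Since
$((E''|F'') \setminus Act_H,\; E'' \setminus Act_H \,|\, F'' \setminus Act_H) \in R$, then $R$ is a bisimulation up to
$\approx_B$.

(ii) Consider the following relation $((E'/Act_H) \setminus L,\; (E' \setminus L)/Act_H) \in R$, for all
$E'$ such that $E \Longrightarrow E'$ and for all $L \subseteq \mathcal{L}$. If we prove that $R$ is a bisimulation
up to $\approx_B$ then, by applying hypothesis and observing that $(E' \setminus L) \setminus Act_H \approx_B
(E' \setminus Act_H) \setminus L$, we obtain the thesis. The only non trivial case is $(E'/Act_H) \setminus L \stackrel{\tau}{\longrightarrow}
(E''/Act_H) \setminus L$ with $E' \stackrel{h}{\longrightarrow} E''$ and $h \in Act_H$. By hypothesis, $E' \in BSNNI$,
hence we have that $(E'/Act_H) \setminus L \approx_B (E' \setminus Act_H) \setminus L$ and so $\exists E'''$ such that
$(E' \setminus Act_H) \setminus L \Longrightarrow (E''' \setminus Act_H) \setminus L \approx_B (E'''/Act_H) \setminus L$ and $(E''' \setminus Act_H) \setminus L \approx_B
(E''/Act_H) \setminus L$. Obviously, we also have that $(E' \setminus L) \setminus Act_H \Longrightarrow (E''' \setminus L) \setminus Act_H$
and so $(E' \setminus L)/Act_H \Longrightarrow (E''' \setminus L)/Act_H$. We briefly summarize the proof: we had
the synchronization $(E'/Act_H) \setminus L \stackrel{\tau}{\longrightarrow} (E''/Act_H) \setminus L$ and we proved that there
exists $E'''$ such that $(E' \setminus L)/Act_H \Longrightarrow (E''' \setminus L)/Act_H$ and $(E'''/Act_H) \setminus L \approx_B
(E''/Act_H) \setminus L$. Since $((E'''/Act_H) \setminus L,\; (E''' \setminus L)/Act_H) \in R$ then $R$ is a weak
bisimulation up to $\approx_B$. ■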

It  is  worthwhile  noticing  that  SBSNNI  was  not  the  first  sufficient  condition 
proposed  for  BNDC.  In  [24]  we  introduced  a  property  stronger  than  SBSNNI , 
but  nevertheless  quite  intuitive,  called  Strong  BNDC  (SBNDC).  This  property 
just  requires  that  before  and  after  every  high  step,  the  system  appears  to  be 
the  same,  from  a  low  level  perspective.  More  formally  we  have  the  following 
definition. 

Definition 14. (SBNDC: Strong BNDC)

A system $E \in SBNDC$ if and only if $\forall E'$ reachable from $E$ and $\forall E''$ such that
$E' \stackrel{h}{\longrightarrow} E''$ for $h \in Act_H$, then $E' \setminus Act_H \approx_B E'' \setminus Act_H$.

We  now  prove  that  SBNDC  is  strictly  stronger  than  SBSNNI.  To  this  purpose, 
we  need  the  following  Lemma 

Lemma 2. $E \in BNDC \Leftrightarrow E \setminus Act_H \approx_B (E|\Pi) \setminus Act_H$ for all $\Pi \in \mathcal{E}_H$.

Proof. It follows immediately from Theorem 1.(ii) and Definition 12. ■


Fig. 11. The inclusion diagram for trace-based and bisimulation-based properties.


Proof. It derives from the definition of the security properties, observing that
trace and bisimulation equivalences are congruences with respect to the $\_ \setminus L$ and $\_|\_$
operators of CCS [50]. It is possible to prove that they are also congruences
with respect to the $\_ \setminus_I L$ and $\_/L$ operators of SPA. For trace equivalence the proof
is trivial. In the following we prove the closure of weak bisimulation equivalence
w.r.t. $\_ \setminus_I L$ and leave to the reader the other similar case of $\_/L$. Let $E \approx_B F$;
we want to prove that $E \setminus_I L \approx_B F \setminus_I L$. Consider a bisimulation $R$ such that
$(E, F) \in R$; then define the relation $P$ as follows: $(E' \setminus_I L, F' \setminus_I L) \in P$ if and
only if $(E', F') \in R$. $P$ is a bisimulation too: in fact, if $E' \setminus_I L \stackrel{\mu}{\longrightarrow} E'' \setminus_I L$ then
$\mu \notin L \cap I$ and $E' \stackrel{\mu}{\longrightarrow} E''$. So $\exists F''$ such that $F' \stackrel{\hat{\mu}}{\Longrightarrow} F''$; since $\mu \notin L \cap I$ then
$F' \setminus_I L \stackrel{\hat{\mu}}{\Longrightarrow} F'' \setminus_I L$ (and vice versa for $F' \setminus_I L \stackrel{\mu}{\longrightarrow} F'' \setminus_I L$). ■


3.4  Building  Channels  by  Exploiting  Deadlocks 

In Example 6 we have seen that Access_Monitor_1 is not BNDC because of
potential high level deadlocks. We said that a deadlock due to high level
activity is visible to low level users, hence it gives some information about
high level actions and cannot be allowed. However, one could doubt that a high
level deadlock is really dangerous and, in particular, that it can be exploited
to transmit information from high to low. We demonstrate that this is indeed the
case by showing that it is possible to build a 1-bit channel from high to low
level using systems which contain high level deadlocks. In particular, we obtain
a 1-bit channel with some initial noise (before the beginning of the
transmission), using three processes with high level deadlocks composed with
other secure systems.


Channel ≝ (Ch_write(0) | Ch_start(0) | AM(0) | AM(1) | AM(2) | R | T) \ N

AM(n) ≝ Access_Monitor_1[check(n)/access_r(0,0),
                         block(n)/access_w(1,1),
                         unblock(n)/write(1,0)] \ {access_r, access_w, write}

Ch_write(x) ≝ send_write.Ch_write(1)
              +
              clear_write.Ch_write(0)
              +
              if x = 1 then
                  up_write.Ch_write(0)

Ch_start(0) ≝ Ch_write(0)[send_start/send_write, up_start/up_write,
                          clear_start/clear_write]

R ≝ send_write.R
    +
    check(2).send_start.
        (check(0).output(0).R
         +
         check(1).output(1).R)

T ≝ block(2).clear_write.up_write.block(0).block(1).T1

T1 ≝ clear_start.unblock(2).up_start.block(2).clear_write.
     input(y).unblock(y).up_write.block(y).T1


Table  10.  The  Channel  process  which  exploits  deadlocks 


3.5  Comparison  with  Related  Work 

The  use  of  process  algebras  to  formalize  information  flow  security  properties 
is  not  new.  In  [57]  it  is  possible  to  find  a  definition  of  Non  Interference  given 
on  CSP  [39].  It  looks  like  SNNI  with  some  side  conditions  on  acceptable  low 
level  actions.  This  definition  is  recalled  in  [4],  where  a  comparison  with  another 
information  flow  property  is  reported. 

More  recent  results  based  on  the  CSP  model  are  contained  in  [56],  where  the 
authors  introduce  some  information  flow  security  properties  based  on  the  notion 
of  deterministic  views  and  show  how  to  automatically  verify  them  using  the  CSP 
model  checker  FDR  [55]. 

The most interesting property is lazy security (L-Sec) which, however,
requires the absence of non-determinism in the low view of the system (i.e., when
hiding high actions through interleaving) and for this reason we think it could
be too restrictive in a concurrent environment. For example, all the low
non-deterministic systems, such as $E = l.l_1 + l.l_2$, are considered not secure. In
this section we compare those properties with ours using a failure-equivalence
version of BNDC, called FNDC (see also [18] for more details). The main
result is that BNDC restricted to the class of low-deterministic and non-divergent
processes is equal to L-Sec.

Here we give a definition of failure equivalence which does not correspond
completely to the original one [39]. Indeed, it does not consider possible
divergences, but this is not a problem since our comparison will focus on the
class of non-divergent processes. We prefer this definition because it is very
simple and is implied by $\approx_B$.

We need some simple additional notation. We write $E \stackrel{\mu}{\not\Longrightarrow}$ to indicate that
$\nexists E'$ such that $E \stackrel{\mu}{\Longrightarrow} E'$, and $E \stackrel{K}{\not\Longrightarrow}$, with $K \subseteq \mathcal{L}$, stands for $\forall \mu \in K$, $E \stackrel{\mu}{\not\Longrightarrow}$.

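Definition 15. If $\gamma \in T(E)$ and if, after executing $\gamma$, $E$ can refuse all the
actions in set $X \subseteq \mathcal{L}$, then we say that the pair $(\gamma, X)$ is a failure of the process
$E$. Formally we have that:

$failures(E) \stackrel{def}{=} \{(\gamma, X) \in \mathcal{L}^* \times \mathcal{P}(\mathcal{L}) \mid \exists E'$ such that $E \stackrel{\gamma}{\Longrightarrow} E'$ and $E' \stackrel{X}{\not\Longrightarrow}\}$

When $failures(E) = failures(F)$ we write $E \approx_F F$ (failure equivalence). ■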
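
Definition 15 can be made concrete on a finite LTS. The sketch below is an illustration under assumptions, not part of the original development: the dictionary encoding of the LTS, the "tau" label, the explicit `alphabet` argument and the bound `max_len` on trace length are all hypothetical. For every trace up to the bound it records the maximal refusal set of each state reached by that trace; any subset of a recorded refusal is also a failure.

```python
TAU = "tau"  # assumed label for internal actions

def tau_closure(lts, states):
    """All states reachable from `states` through tau steps only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for act, t in lts.get(s, []):
            if act == TAU and t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def weak_enabled(lts, state):
    """Visible actions `state` can perform, possibly after tau steps."""
    return {a for u in tau_closure(lts, {state})
            for a, _ in lts.get(u, []) if a != TAU}

def after(lts, states, action):
    """States weakly reachable from a tau-closed set by one visible action."""
    targets = {t for s in states for a, t in lts.get(s, []) if a == action}
    return tau_closure(lts, targets)

def failures(lts, init, alphabet, max_len):
    """Pairs (trace, maximal refusal) for all traces of length <= max_len."""
    alphabet = frozenset(alphabet)
    result = set()
    frontier = {(): tau_closure(lts, {init})}
    for _ in range(max_len + 1):
        next_frontier = {}
        for trace, states in frontier.items():
            for s in states:
                # s refuses exactly the actions it cannot weakly perform
                result.add((trace, alphabet - weak_enabled(lts, s)))
            for a in alphabet:
                succ = after(lts, states, a)
                if succ:
                    next_frontier[trace + (a,)] = succ
        frontier = next_frontier
    return result

# E = l.l1.0 + l.l2.0: after l, the process may refuse l1 or may refuse l2.
example = {0: [("l", 1), ("l", 2)], 1: [("l1", 3)], 2: [("l2", 4)], 3: [], 4: []}
for trace, refusal in sorted(failures(example, 0, {"l", "l1", "l2"}, 2),
                             key=lambda f: (f[0], sorted(f[1]))):
    print(trace, sorted(refusal))
```

To compare two processes for failure equivalence with this representation, one would first discard, for each trace, the refusal sets that are contained in a larger recorded refusal, and then compare the remaining sets.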

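We identify a process $E$ with its failure set. So if $(\gamma, X) \in failures(E)$ we write
$(\gamma, X) \in E$. Note that $\gamma \in T(E)$ if and only if $(\gamma, \emptyset) \in E$. So $E \approx_F F$ implies
$E \approx_T F$.

We also have that $E \approx_B F$ implies $E \approx_F F$:

Proposition 8. $E \approx_B F$ implies $E \approx_F F$.

Proof. Consider $E \approx_B F$ and $(\gamma, X) \in E$. We want to prove that $(\gamma, X) \in F$.
Since $(\gamma, X) \in E$ we have that $\exists E'$ such that $E \stackrel{\gamma}{\Longrightarrow} E'$ and $E' \stackrel{X}{\not\Longrightarrow}$. By definition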


A, with $\mathcal{L}(E)/A$ as inverse (the same holds for $B/\mathcal{L}(F)$ and $\mathcal{L}(F)/B$). This
expression means that the actions in $E$ and $F$ are first relabelled using the two
disjoint sets $A$ and $B$, then interleaved (no communication is possible) and finally
renamed to their original labels.

Recall that a process is divergent if it can execute an infinite sequence of
internal actions $\tau$. As an example consider the agent $A \stackrel{def}{=} \tau.A + b.0$ which can
execute an arbitrary number of $\tau$ actions. We define Nondiv as the set of all the
non-divergent processes.
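
For a finite state process, divergence amounts to the existence of a cycle of internal actions. The following sketch illustrates this; it is not taken from the source. It assumes a dictionary encoding of the LTS and the label "tau", and it assumes that a process counts as divergent when any state reachable from the initial one lies on, or can reach through tau steps, such a cycle.

```python
TAU = "tau"  # assumed label for internal actions

def reachable(lts, init):
    """All states reachable from `init` through any transitions."""
    seen, stack = {init}, [init]
    while stack:
        s = stack.pop()
        for _, t in lts.get(s, []):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def is_divergent(lts, init):
    """True if some reachable state can start an infinite tau sequence."""
    states = reachable(lts, init)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {s: WHITE for s in states}

    def visit(s):
        # depth-first search restricted to tau transitions
        colour[s] = GREY
        for act, t in lts.get(s, []):
            if act != TAU:
                continue
            if colour[t] == GREY:          # back edge: tau cycle found
                return True
            if colour[t] == WHITE and visit(t):
                return True
        colour[s] = BLACK
        return False

    return any(colour[s] == WHITE and visit(s) for s in states)

# The agent A = tau.A + b.0 from the text diverges; tau.b.0 does not.
agent_a = {"A": [(TAU, "A"), ("b", "0")], "0": []}
finite = {"B": [(TAU, "C")], "C": [("b", "0")], "0": []}
print(is_divergent(agent_a, "A"), is_divergent(finite, "B"))
```

The recursive search is adequate for small examples; a large state space would call for an iterative version.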

We can now present the lazy security property [56]. This property implies
that the obscuring of high level actions by interleaving does not introduce any
non-determinism. The obscuring of high level actions of process $E$ by interleaving
is obtained considering process $E \,|||\, RUN_H$ where $RUN_H \stackrel{def}{=} \sum_{h \in Act_H} h.RUN_H$.
In such a process an outside observer is not able to tell if a certain high level
action comes from $E$ or from $RUN_H$.

L-Sec also requires that $E \,|||\, RUN_H$ is non-divergent (see footnote 9). This is
equivalent to requiring that $E$ is non-divergent, because $RUN_H$ is non-divergent
and the $|||$ operator does not allow synchronizations (which could generate new
$\tau$ actions).

Definition 17. $E \in$ L-Sec $\Leftrightarrow E \,|||\, RUN_H \in Det \cap Nondiv$. ■

In the following we want to show that L-Sec can only analyze systems which are
low-deterministic, i.e., where after any low level trace $\gamma$ no low level action $l$ can
be both accepted and refused. The low-determinism requirement is not strictly
necessary to avoid information flows from high to low level. So, in some cases,
L-Sec is too strong. As an example consider the following non-deterministic
system without high level actions: $E \stackrel{def}{=} l.l'.0 + l.l''.0$. It is obviously secure but
it is not low-deterministic and so it is not L-Sec. Formally we have that:

Definition 18. $E$ is low-deterministic ($E \in Lowdet$) iff $E \setminus Act_H \in Det$. ■

The following holds:

Theorem 6. L-Sec $\subseteq$ Lowdet.

Proof. Let $E \in$ L-Sec. Consider a trace $\gamma a$ of $E \setminus Act_H$ and suppose that
$(\gamma, \{a\}) \in E \setminus Act_H$. So there exists $E'$ such that $E \setminus Act_H \stackrel{\gamma}{\Longrightarrow} E' \setminus Act_H$ and such
that $E' \setminus Act_H \stackrel{a}{\not\Longrightarrow}$. Since $RUN_H$ cannot execute the low level action $a$ then we
have that $E' \,|||\, RUN_H \stackrel{a}{\not\Longrightarrow}$ and so $(\gamma, \{a\}) \in E \,|||\, RUN_H$ because $E \,|||\, RUN_H \stackrel{\gamma}{\Longrightarrow}
E' \,|||\, RUN_H$. Since $\gamma a$ is a trace for $E \setminus Act_H$ then it is also a trace for $E \,|||\, RUN_H$
and we obtain that $E \,|||\, RUN_H$ is not deterministic, contradicting the hypothesis.
So $(\gamma, \{a\}) \notin E \setminus Act_H$ and $E \in Lowdet$. ■
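
The low-determinism condition of Definition 18 can be checked mechanically on a finite LTS. The sketch below is an illustration only. It assumes that Det means "after any trace, no action can be both accepted and refused" (the informal reading given above), a dictionary encoding of the LTS, and the label "tau"; all function names are hypothetical.

```python
TAU = "tau"  # assumed label for internal actions

def tau_closure(lts, states):
    """All states reachable from `states` through tau steps only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for act, t in lts.get(s, []):
            if act == TAU and t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def weak_enabled(lts, state):
    """Visible actions `state` can perform, possibly after tau steps."""
    return {a for u in tau_closure(lts, {state})
            for a, _ in lts.get(u, []) if a != TAU}

def is_deterministic(lts, init):
    """No action is both accepted and refused after the same trace."""
    start = tau_closure(lts, {init})
    seen, stack = {start}, [start]
    while stack:
        states = stack.pop()               # all states reached by one trace
        offers = [weak_enabled(lts, s) for s in states]
        accepted = set().union(*offers)    # actions extending the trace
        if any(offer != accepted for offer in offers):
            return False                   # some state refuses an accepted action
        for a in accepted:
            targets = {t for s in states for act, t in lts.get(s, []) if act == a}
            nxt = tau_closure(lts, targets)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

def is_low_deterministic(lts, init, high):
    """Definition 18: E \\ Act_H must be deterministic."""
    restricted = {s: [(a, t) for a, t in trs if a not in high]
                  for s, trs in lts.items()}
    return is_deterministic(restricted, init)

# l.l'.0 + l.l''.0 has no high actions but is not low-deterministic.
choice = {0: [("l", 1), ("l", 2)], 1: [("l1", 3)], 2: [("l2", 4)], 3: [], 4: []}
print(is_low_deterministic(choice, 0, set()))
```

The example reproduces the observation in the text: the system is secure, since it has no high level actions at all, yet it fails the low-determinism test and therefore cannot be L-Sec.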


9  Note  that  in  [56]  the  non-divergence  requirement  is  inside  the  deterministic  one.  This 
is  because  the  authors  use  the  failure-divergence  semantics  [10].  In  this  work  we  use 
the  failure  equivalence  which  does  not  deal  with  divergences.  So,  in  order  to  obtain 
exactly  the  L-Sec  property,  we  require  the  non-divergence  condition  explicitly. 


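traces($E'/Act_H$). Since we have that $E \setminus Act_H \stackrel{\gamma}{\Longrightarrow} E' \setminus Act_H$ then $(E|\Pi) \setminus
Act_H \stackrel{\gamma}{\Longrightarrow} (E'|\Pi) \setminus Act_H$ and so $(\gamma, K) \in (E|\Pi) \setminus Act_H$.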

The inclusion is strict because agent $E \stackrel{def}{=} l.h.l.0 + l.0 + l.l.0$ is FNDC but not
SFSNNI.

(FNDC $\subset$ FSNNI) It is sufficient to consider $\Pi = 0$. We have that $(E|0) \setminus
Act_H \approx_F E \setminus Act_H$ and so, since $(E|0) \setminus Act_H \approx_F E/Act_H$, we have $E/Act_H \approx_F
E \setminus Act_H$.

The inclusion is strict because agent $E \stackrel{def}{=} l.h.l.h'.l.0 + l.0 + l.l.l.0$ is FSNNI but
not FNDC. ■

Figure 14 summarizes the inclusions among the presented security properties.
It can be drawn using the previous inclusion results and the following remarks:
BNDC $\not\subseteq$ SFSNNI, in fact agent $l.h.l.0 + l.0 + l.l.0$ is BNDC but not SFSNNI;
we also have that BSNNI $\not\subseteq$ FNDC because of agent $h.l.h'.l.0 + l.l.0$; finally
SFSNNI $\not\subseteq$ BSNNI because of agent $h.l.(l'.0 + l''.0) + l.l'.0 + l.l''.0$.

The  next  theorem  shows  that  under  the  low-determinism  assumption  the 
properties  SFSNNI  and  FNDC  collapse  into  the  same  one.  We  need  the  following 
Lemma. 

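Lemma 3. If $E, F \in Det$, $E \stackrel{\gamma}{\Longrightarrow} E'$, $F \stackrel{\gamma}{\Longrightarrow} F'$ and $E \approx_F F$ then $E' \approx_F F'$.

Proof. We prove that if $(\delta, K) \in E'$ then $(\delta, K) \in F'$. Let $(\delta, K) \in E'$. Then
$(\gamma\delta, K) \in E$ and by $E \approx_F F$ we obtain that $(\gamma\delta, K) \in F$. So $\exists F'', F'''$ such that
$F \stackrel{\gamma}{\Longrightarrow} F'' \stackrel{\delta}{\Longrightarrow} F''' \stackrel{K}{\not\Longrightarrow}$, hence $(\delta, K) \in F''$. Since $F \in Det$ then by Proposition 9
and hypothesis we have that $F'' \approx_F F'$ and so $(\delta, K) \in F'$. We can prove in the
same way that if $(\delta, K) \in F'$ then $(\delta, K) \in E'$. So $E' \approx_F F'$. ■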

Theorem 8. FNDC $\cap$ Lowdet $\subseteq$ SFSNNI.

Proof. Since FNDC $\subset$ FSNNI and $E \in FNDC$, we have that $E \setminus Act_H \approx_F
E/Act_H$. By $E \in Lowdet$ we obtain $E/Act_H \in Det$. Now consider $E \stackrel{\gamma}{\Longrightarrow} E'$.
We have to prove that $E'/Act_H \approx_F E' \setminus Act_H$. Let $\Pi'$ be the high level process
which executes exactly the complement of the high level projection of $\gamma$, i.e.
the complement of the subsequence of $\gamma$ composed by all the high level actions
in $\gamma$. If $\gamma'$ is the low level projection of $\gamma$ we have that $(E|\Pi') \setminus Act_H \stackrel{\gamma'}{\Longrightarrow}
(E'|0) \setminus Act_H \approx_F E' \setminus Act_H$. Since $E \stackrel{\gamma}{\Longrightarrow} E'$ then $E/Act_H \stackrel{\gamma'}{\Longrightarrow} E'/Act_H$. By
hypothesis we have that $(E|\Pi') \setminus Act_H \approx_F E/Act_H$. Since $E/Act_H \in Det$ then,
by Lemma 3, we have that $E'/Act_H \approx_F (E'|0) \setminus Act_H \approx_F E' \setminus Act_H$. ■

Corollary 2. FNDC $\cap$ Lowdet $=$ SFSNNI $\cap$ Lowdet.

Proof. Trivial by Theorems 8 and 7. ■


Comparison We now show that under the low-determinism and the
non-divergence assumptions the BNDC property is equal to L-Sec. We start by
proving this result for FNDC.




Fig.  15.  Relations  among  properties. 


contradict the fact that $E \in$ L-Sec and so $E''/Act_H \stackrel{a}{\not\Longrightarrow}$, $\forall a \in K$. Hence $(\delta, K) \in
E'/Act_H$. ■

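Theorem 10. SFSNNI $\cap$ Lowdet $\cap$ Nondiv $\subseteq$ L-Sec.

Proof. Let $E \in$ SFSNNI $\cap$ Lowdet $\cap$ Nondiv and $\gamma a$ be a trace for process
$E \,|||\, RUN_H$. We want to prove that $(\gamma, \{a\}) \notin E \,|||\, RUN_H$. It trivially holds if
$a \in Act_H$ because in such a case it can always be executed by $RUN_H$. So let
$a \in Act_L$. Suppose $E \,|||\, RUN_H \stackrel{\gamma}{\Longrightarrow} E' \,|||\, RUN_H \stackrel{a}{\not\Longrightarrow}$ and consider the sequence $\gamma'$
obtained removing all the high level actions from $\gamma$. Then $E/Act_H \stackrel{\gamma'}{\Longrightarrow} E'/Act_H$
and by hypothesis $E'/Act_H \approx_F E' \setminus Act_H$. Since $E' \,|||\, RUN_H \stackrel{a}{\not\Longrightarrow}$ then $E' \setminus Act_H \stackrel{a}{\not\Longrightarrow}$
and so $E'/Act_H \stackrel{a}{\not\Longrightarrow}$ and $(\gamma', \{a\}) \in E/Act_H$. Since $E \in SFSNNI$ we obtain that
$(\gamma', \{a\}) \in E \setminus Act_H$. Now $\gamma a$ is a trace for $E \,|||\, RUN_H$ and so $\gamma' a$ must be a trace
for $E/Act_H$; this means that $\gamma' a$ is also a trace for $E \setminus Act_H$. Since $E \in Lowdet$
then $E \setminus Act_H$ is deterministic. However we found that $\gamma' a$ is a trace for $E \setminus Act_H$
and $(\gamma', \{a\}) \in E \setminus Act_H$, obtaining a contradiction. So $E' \,|||\, RUN_H$ cannot refuse
$a$ and $(\gamma, \{a\}) \notin E \,|||\, RUN_H$. Hence $E \,|||\, RUN_H \in Det$ and since $E \in Nondiv$ we
also have that $E \,|||\, RUN_H \in Nondiv$. ■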

Corollary 3. SFSNNI $\cap$ Lowdet $\cap$ Nondiv $=$ L-Sec.

Proof. By Theorems 6 and 9 and by Definition 17 we find that L-Sec $\subseteq$ SFSNNI
$\cap$ Lowdet $\cap$ Nondiv. Finally by Theorem 10 we obtain the thesis. ■

Note that by Corollary 2 we also have that FNDC $\cap$ Lowdet $\cap$ Nondiv $=$ L-Sec.
Now we show that this result also holds for SBSNNI and BNDC. We first prove
that for deterministic processes $\approx_F$ becomes equal to $\approx_B$.


Fig.  16.  The  inclusion  diagram  for  trace-based  security  properties. 


A good reference about modelling NI in CSP can be found in this volume
[58]. In that paper, a new notion based on power bisimulation is also proposed.
We intend to compare it with our bisimulation-based properties, with the aim
of making our classification as complete as possible.


3.6  Other  Security  Properties 

In [24] we have compared our properties with a number of existing proposals. Here
we just report the diagram (Figure 16) of the relations among such properties and
we give the bibliographic references to them. In particular, TNDI derives from
Non Deducibility on Inputs [62], lts-Correctability comes from Correctability [41],
lts-FC is Forward-Correctability [41], lts-RES is a version of Restrictiveness [48].
Moreover $NNI_{IT}$ is NNI where we require that systems are input total. This
means that in every state the system must be able to accept every possible
input by the environment. The aim of this (quite restrictive) condition is to
prevent users from deducing anything about the state of the system, even if
they observe the inputs the system accepts. All the properties included in $NNI_{IT}$
require this condition over input actions. $NDC_{IT}$ is NDC with input totality.
Finally $NDC_H$ and $NDC^{IT}_M$ are parametric versions of NDC where the high
level user can exploit only the actions in sets $H$ and $M$ to communicate with

Let $T[x, y]$ be the table element contained in row $x$ and column $y$. If $T[x, y] \in \{=, \subset\}$
then $x \; T[x, y] \; y$. If $T[x, y] = n$ then the agent $n$ below is in $x$ and is not in $y$. In the
following $\Pi_0$ represents the Input-Total empty agent: $\Pi_0 \stackrel{def}{=} \sum_{i \in I} i.\Pi_0$.

!)  Z-  =  E<e/ i  Z  +  h.Z'  and  2"  =  j;  i.Z'  +  IM0 

2)  AJ.O  +  I.O  +  Z'.O 

3)  Z  —  E»e/  LZ  +  h.(77o  +  LI lo) 

4)  Z  =  E*e/  *■■2’  +  h.(n0  4-  Z.77o)  with  h  g  M  U  M 

5)  Z  =  +  l.Ilo  +  h.Ilo 

6)  Z  =  Eie/  * ^ +  h'.SS"  and  Z'  =  77o  +  h.Z'  +  i./7o  and 

^  r  4- h.77o  +  L/7o 

7)  fc.1.0  with  /i  £  H  U  H 

8)  h.i.O 

9 )  Z  —  Eie/ 1  ^  +  h.Z*  and  Z '  =  Etc/  * +  r./70  with  h  €  H  U  H 
10)  0 


Table 11. The inclusion table for trace-based security properties.


In the analysis layer, CoSeC uses an equivalence routine and a minimization
routine that belong to the analysis layer of CW. These are a slight modification
of the algorithm by Kanellakis and Smolka [42], which finds a bisimulation
between the roots of two finite state LTSs by partitioning their states. It is
interesting to note that a simple modification of this algorithm can be used to
obtain the minimization of a finite state LTS.
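
The idea of finding a bisimulation by partitioning states can be illustrated with a naive refinement loop. The sketch below is not the Kanellakis and Smolka algorithm itself nor the CoSeC/CW routine: it is a simplified, less efficient variant for strong bisimulation, written under the assumption that the LTS is given as a Python dictionary. Handling weak bisimulation would additionally require saturating the transition relation with tau steps, which is omitted here.

```python
def bisimulation_classes(lts):
    """Naive partition refinement; returns a map state -> block id."""
    states = set(lts) | {t for trs in lts.values() for _, t in trs}
    block = {s: 0 for s in states}          # start with a single block
    while True:
        # A state's signature: its current block plus, for each outgoing
        # transition, the action and the block of the target state.
        signature = {
            s: (block[s], frozenset((a, block[t]) for a, t in lts.get(s, [])))
            for s in states
        }
        ids = {}
        refined = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        if len(ids) == len(set(block.values())):
            return refined                  # no block was split: stable
        block = refined

def bisimilar(lts, s, t):
    """Two states are strongly bisimilar iff they end in the same block."""
    classes = bisimulation_classes(lts)
    return classes[s] == classes[t]

# a.0 + a.0 is bisimilar to a.0, while a.b.0 + a.c.0 and a.(b.0 + c.0) are not.
same = {"p": [("a", "p1"), ("a", "p2")], "p1": [], "p2": [],
        "q": [("a", "q1")], "q1": []}
diff = {"p": [("a", "p1"), ("a", "p2")], "p1": [("b", "z")], "p2": [("c", "z")],
        "q": [("a", "q1")], "q1": [("b", "z"), ("c", "z")], "z": []}
print(bisimilar(same, "p", "q"), bisimilar(diff, "p", "q"))
```

The final partition also yields a minimized LTS: each block becomes one state, which is the "simple modification" the text alludes to.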

4.2  Checking  the  Information  Flow  Properties 

Here we describe in detail how the verification of Information Flow Properties
proceeds. As we said before, we have properties which are in the following form:

$E$ is X-secure if and only if $C_X[E] \approx V_X[E]$.

We have seen for SNNI that $C_X[-] = - \setminus Act_H$ and $V_X[-] = -/Act_H$. Hence,
checking the X-security of $E$ is reduced to the "standard" problem of checking
semantic equivalence between two terms having $E$ as a sub-term.

In the following we briefly explain how the system works in evaluating the
security predicates NNI, SNNI, NDC, BNNI, BSNNI, SBSNNI, and we discuss
their computational complexity. CoSeC computes the value of these predicates
over finite state agents (i.e. agents possessing a finite state LTS), based on
the definitions given in Section 3 that we report below in CoSeC syntax (for ease
of parsing, in CoSeC the hiding and input restriction operators are represented
by ! and ?, respectively):

$E \in NNI \Leftrightarrow E\,!\,Act_H \approx_T (E\,?\,Act_H)\,!\,Act_H$
$E \in SNNI = NDC \Leftrightarrow E\,!\,Act_H \approx_T E \setminus Act_H$
$E \in BNNI \Leftrightarrow E\,!\,Act_H \approx_B (E\,?\,Act_H)\,!\,Act_H$
$E \in BSNNI \Leftrightarrow E\,!\,Act_H \approx_B E \setminus Act_H$
$E \in SBSNNI \Leftrightarrow E' \in BSNNI$, $\forall E'$ reachable from $E$

As for CW, the inner computation of CoSeC follows three main phases.

Phase a) (the same for all predicates) CoSeC builds the RLTSs of the two agents
whose equivalence it wants to compute. For example, in the case of NNI,
CoSeC computes the transition graphs for (E?ActH)!ActH and E!ActH. In
this phase we do not have any particular problem with complexity, except
for the intrinsic exponential explosion in the number of states of the RLTS
due to parallel composition.

Phase b) (This is split into two, depending on the semantics requested by the
security predicate)

b1: (for predicates NNI, SNNI, NDC) The two RLTSs obtained in Phase
a) are transformed into deterministic graphs following the classic subset
construction (see e.g. [40]). This algorithm has exponential complexity
since it is theoretically possible, in the deterministic graph, to have a
node for every subset of nodes in the original graph. However, experience


Command: nni A
true

CoSeC tells us that A is NNI secure. Now we can check if agent A is SNNI secure
too:

Command: snni A
false

So A is NNI secure but is not SNNI secure. If we want to know why such a
system is not SNNI we can use the debugging version of the SNNI command:

Command: d_snni A
false
Agent A!ActH
can perform action sequence 'l
which agent A\ActH
cannot

The tool shows a (low level) trace which distinguishes processes $A/Act_H$ and
$A \setminus Act_H$. The trace is $l$, which can be executed only by the first one. This can
be useful to understand why a process is not secure. Finally the command quit
causes an exit to the shell.

4.4  An  Example:  Checking  the  Access  Monitor 

In this Section we use CoSeC to automatically check all the versions of the
access monitor discussed in Example 6. Since CoSeC works on SPA agents we
have to translate all the VSPA specifications into SPA. Consider once more
Access_Monitor_1. Table 12 reports the translation of the Access_Monitor_1
specification into the CoSeC syntax for SPA (see footnote 12). A new command,
basi, has been used; it binds a set of actions to an identifier. Moreover, the \
character at the end of a line does not represent the restriction operator, but is
the special character that allows the description of long agents and long action
lists to be broken across several lines.

We can write the contents of Table 12 to a file and load it, in CoSeC, with the
command if <filename>. Now we can check that Access_Monitor_1 satisfies all
the security properties except SBSNNI using the following command lines:

Command: bnni Access_Monitor_1
true

Command: bsnni Access_Monitor_1
true

Command: sbsnni Access_Monitor_1
false : ('val_h1.Monitor | Object_l1 | Object_h1) \ L

12 In the translation, we use values {l, h} in place of {0, 1} for the levels of users and
objects in order to make the SPA specification clearer. As an example access_r(1, 0)
becomes access_r_hl.


Note that when CoSeC fails to verify SBSNNI on a process $E$, it gives as output
an agent $E'$ which is reachable from $E$ and is not BSNNI.

So we have found that

Access_Monitor_1 $\in$ BSNNI, BNNI

but

Access_Monitor_1 $\notin$ SBSNNI

Since we have that SBSNNI $\subset$ BNDC $\subset$ BSNNI, BNNI, we cannot conclude
whether Access_Monitor_1 is BNDC or not. However, using the output state
$E'$ of the SBSNNI verification, it is easy to find a high level process $\Pi$ which
can block the monitor. Indeed, in the state given as output by SBSNNI, the
monitor is waiting for the high level action 'val_h1; so, if we find a process
$\Pi$ which moves the system to such a state and does not execute the val_h1
action, we will have a high level process able to block the monitor. It is sufficient
to consider $\Pi$ = 'access_r_hh.0. Agent (Access_Monitor_1 | $\Pi$) \ $Act_H$ will be
blocked immediately after the execution of the read request by $\Pi$, moving to
the following deadlock state:

(('val_h0.Monitor | Object_l0 | Object_h0) \ L | 0) \ ActH

(this state differs from the one given as output by SBSNNI only for the values
stored in objects). It is possible to verify that Access_Monitor_1 $\notin$ BNDC by
checking that (Access_Monitor_1 | $\Pi$) \ $Act_H$ $\not\approx_B$ Access_Monitor_1/$Act_H$ using
the following commands:

Command: bi Pi 'access_r_hh.0

Command: eq
Agent: (Access_Monitor_1 | Pi) \ acth
Agent: Access_Monitor_1 ! acth
false


As we said in Example 6, such a deadlock is caused by synchronous
communications in SPA. Moreover, using the CoSeC output again, we can find out that
the high level process $\Pi'$ = 'access_w_hl.0 can also block Access_Monitor_1,
because it executes a write request and does not send the corresponding value.
Hence, in Example 6 we proposed the modified system Access_Monitor_5 with
an interface for each level and atomic actions for write request and value sending.
We finally check that this version of the monitor is SBSNNI, hence BNDC too:

Command: sbsnni Access_Monitor_5
true


4.5  State  Explosion  and  Compositionality 

In this section we show how the parallel composition operator can exponentially
increase the number of states of the system, and then how it can slow down

- $E, E' \in P \Longrightarrow E|E' \in P$

- $E \in P, L \subseteq \mathcal{L} \Longrightarrow E \setminus L \in P$

- $E \in P, Z \stackrel{def}{=} E \Longrightarrow Z \in P$

and let $A_P$ be a decision algorithm which checks if a certain agent $E \in \mathcal{E}_{FS}$
belongs to $P$; in other words, $A_P(E) = true$ if $E \in P$, $A_P(E) = false$ otherwise.
Then we can define a compositional algorithm $A'_P(E)$ in the following way:

1) if $E$ is of the form $E' \setminus L$, then compute $A'_P(E')$; if $A'_P(E') = true$ then
return true, else return the result of $A_P(E)$;

2) if $E$ is of the form $E_1|E_2$, then compute $A'_P(E_1)$ and $A'_P(E_2)$; if $A'_P(E_1) =
A'_P(E_2) = true$ then return true, else return the result of $A_P(E)$;

3) if $E$ is a constant $Z$ with $Z \stackrel{def}{=} E'$, then return the result of $A'_P(E')$;

4) if $E$ is not in any of the three forms above, then return $A_P(E)$. ■

The compositional algorithm $A'_P(E)$ works as the given algorithm $A_P(E)$ when
the outermost operator of $E$ is neither the restriction operator, nor the parallel
one, nor a constant definition. Otherwise, it applies componentwise to the
arguments of the outermost operator; if the property does not hold for them, we
cannot conclude that the whole system is not secure, and we need to check it
with the given algorithm.
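
The four rules of the compositional algorithm can be sketched as a recursive function over the syntax of agents. This is only an illustration under assumptions: the term classes `Restrict`, `Par`, `Const` and `Leaf`, the `definitions` dictionary and the `direct_check` callback (standing for the given algorithm $A_P$) are hypothetical names introduced here, and constants are assumed not to unfold into themselves without passing through another operator.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Restrict:          # E' \ L
    body: object
    labels: frozenset

@dataclass(frozen=True)
class Par:               # E1 | E2
    left: object
    right: object

@dataclass(frozen=True)
class Const:             # a constant Z, defined in `definitions`
    name: str

@dataclass(frozen=True)
class Leaf:              # any other agent, treated as atomic here
    name: str

def compositional_check(term, direct_check, definitions):
    """Sketch of A'_P: try the components first, fall back to A_P."""
    if isinstance(term, Restrict):                       # rule 1
        if compositional_check(term.body, direct_check, definitions):
            return True
        return direct_check(term)
    if isinstance(term, Par):                            # rule 2
        left = compositional_check(term.left, direct_check, definitions)
        right = compositional_check(term.right, direct_check, definitions)
        if left and right:
            return True
        return direct_check(term)
    if isinstance(term, Const):                          # rule 3
        return compositional_check(definitions[term.name], direct_check, definitions)
    return direct_check(term)                            # rule 4

# Hypothetical oracle standing in for A_P: only A and the whole system are secure.
A, B = Leaf("A"), Leaf("B")
system = Restrict(Par(A, B), frozenset({"c"}))
log = []

def oracle(term):
    log.append(term)
    return term in {A, system}

print(compositional_check(system, oracle, {}), len(log))
```

In this example the check on component B fails, so the algorithm falls back to the direct check first on the parallel composition and then on the whole restricted system, as the fallback clauses of rules 1 and 2 prescribe.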

Note that the compositional algorithm exploits the assumption that property
$P$ is closed with respect to restriction and uses this in step 1. This could seem
of little practical use, as the dimension of the state space for, let us say, $E$ is
often bigger than that of $E \setminus L$. However, parallel composition is often used in
the form $(A|B) \setminus L$ in order to force some synchronizations, and so if we want to
check $P$ over $A$ and $B$ separately, we must be guaranteed that $P$ is preserved by
both parallel and restriction operators.

To obtain the result for $A'_P(F)$, we essentially apply, in a syntax-driven
way, the four rules above recursively, obtaining a proof tree having (the value
of) $A'_P(F)$ as the root and the various (values of) $A_P(E)$'s on the leaves for
the subterms $E$ of $F$ on which the induction cannot be applied anymore. The
following theorem justifies the correctness of the compositional algorithm, by
proving that the evaluation strategy terminates and gives the same result as the
given algorithm $A_P(F)$.

Theorem 12. Let $F \in \mathcal{E}_{FS}$. If the agent $E'$ occurring in step 1 belongs to
$\mathcal{E}_{FS}$ each time the algorithm $A'_P$ executes that step, then $A'_P(F)$ terminates and
$A_P(F) = A'_P(F)$.

Proof. First we want to prove that, in computing $A'_P(F)$, if the evaluation of
the given algorithm $A_P$ is required on an agent $E$, then $E$ belongs to $\mathcal{E}_{FS}$. The
proof is by induction on the proof tree for the evaluation of $A'_P(F)$. The base
case is when $F$ can be evaluated by step 4; as, by hypothesis, agent $F$ is finite
state, the thesis follows trivially. Instead, if $F$ is of the form $E' \setminus L$, then, by
the premise of this theorem, $E' \in \mathcal{E}_{FS}$, and the inductive hypothesis can be
applied. In step 2, as $F = E_1|E_2$, we have that $E_1, E_2 \in \mathcal{E}_{FS}$, and the inductive


Command: c_sbsnni Access_Monitor_5
[\] Verifying AM | Interf
[|] Verifying AM
[\] Verifying Monitor_5 | Object_l0 | Object_h0
[|] Verifying Monitor_5
[|] Failed!
[\] Failed!
Verifying directly (Monitor_5 | Object_l0 | Object_h0)\L
[|] Failed!
[\] Failed!
Verifying directly (AM | Interf)\K
true


Table 14. Verification of SBSNNI on Access_Monitor_5 with the compositional
algorithm.


Access_Monitor_6 ≝ (AM_6 | Interf_6) \ N

AM_6 ≝ ((Monitor_5 | Object(1,0) | Object(0,0)
         | hBuf(empty)) \ L)[res(0,y)/val(0,y)]

hBuf(j) ≝ res(1,j).hBuf(empty) + val(1,k).hBuf(k)

Interf_6 ≝ Interf_6(0) | Interf_6(1)

Interf_6(l) ≝ a_r(l,x).access_r(l,x).Interf_6_reply(l)
              +
              a_w(l,x,z).access_w(l,x,z).Interf_6(l)

Interf_6_reply(l) ≝ res(l,y).
                    (if y = empty then
                         Interf_6_reply(l)
                     else
                         put(l,y).Interf_6(l))


Table 15. The Access_Monitor_6.


In fact, suppose we want to add other objects to Access_Monitor_6; in such a
case, the size of AM_6 will increase exponentially with respect to the number of
added objects. Now we present a rather modular version of the access monitor.
The basic idea of this new version (Figure 19) is that every object has a
"private" monitor which implements the access functions for that (single) object.
To do this, we have decomposed process Monitor_5 into two different processes,
one for each object; then we have composed these processes with their respective
objects, together with a high-level buffer, obtaining the SBSNNI-secure Modh
and Modl agents. In particular, Monitor_7(x) handles the accesses to object x
(x = 0 low, x = 1 high). As in Access_Monitor_6, we have an interface which
guarantees the exclusive use of the monitor within the same level and is able
to read values from the high buffer. The resulting system is reported in
Table 17, where L = {res, access_r, access_w} and Lh = {r, w, val(1,y)}. Table 18
reports the output of the (successful) verification of the SBSNNI property for
Access_Monitor_7. This task takes about 20 seconds on a SUN5 workstation,
supporting our claim that a modular definition would help. Moreover, we can
also check that the new version of the monitor is functionally equivalent to the
previous ones: in about 5 minutes, CoSeC is able to check that Access_Monitor_7
is equivalent to Access_Monitor_5, and so also to Access_Monitor_6.


Access_Monitor_7 ≝ (Modh | Modl | Interf_6) \ L

Modh ≝ ((Monitor_7(1) | Object(1,0) | hBuf(empty)) \ Lh)
       [res(0,y)/val(0,y)]

Modl ≝ ((Monitor_7(0) | Object(0,0) | hBuf(empty)) \ Lh)
       [res(0,y)/val(0,y)]

Monitor_7(x) ≝ access_r(l,x).
               (if x ≤ l then
                  r(x,y).val(l,y).Monitor_7(x)
                else
                  val(l,err).Monitor_7(x))
               +
               access_w(l,x,z).
               (if x ≥ l then
                  w(x,z).Monitor_7(x)
                else
                  Monitor_7(x))


Table 17. The Access_Monitor_7.


In recent papers [28,3], the underlying model has been extended in order
to deal with time and probability. Once an appropriate semantic equivalence
has been defined in these new models, the BNDC property has been shown to
naturally handle the new features of the model. In particular, in such models,
BNDC has been shown to be able to detect timing and probabilistic covert
channels, respectively.

Another aspect we are studying is the possibility of defining a criterion for
evaluating the quality of information flow properties [33]. We are trying to do
this by defining classes of properties which guarantee the impossibility of
constructing some "canonical" channels. We have seen, for example, that using
some systems which are not BNDC it is possible to obtain an (initially noisy)
perfect channel from high to low level. The aim is to classify the information
flow properties depending on which kind of channels they effectively rule out.

We have seen that it is possible to automatically check almost all the
properties we have presented. Indeed we are still looking for a good (necessary and
sufficient) characterization of the BNDC property. We have also briefly
presented the CoSeC tool. In [47], Martinelli has applied partial model checking
techniques to the verification of BNDC, leading to the implementation of an
automatic verifier [46] which is able to automatically synthesize the possible
interfering high-level process.

As we have stated above, the setting we have proposed is quite general. We
claim that information flow (or NI) properties could have a number of different
applications since they basically capture the possibility for a class of users of
modifying the behaviour of another user class. This generality has allowed us to
apply some variants of our properties to the analysis of cryptographic protocols
[15,30,29,31], starting from a general scheme proposed in [34]. This has been
the topic of the second part of the course "Classification of Security Properties"
at the FOSAD'00 school, and we are presently working on a tutorial which will cover
it [27].

This application of NI properties to network security is new to our
knowledge. The interesting point is that they can be applied to the verification of
protocols with different aims, e.g., authentication, secrecy, key-distribution. We
have analyzed a number of different protocols, thanks to a new tool interface
which makes it possible to specify value-passing protocols and to automatically
generate the enemy [15]; this has also allowed us to find new anomalies in some
cryptographic protocols [16].

In [19,20,11,12], a new definition of entity authentication, which is based
on explicit locations of entities, has been proposed. We are presently trying
to characterize this property, too, through information flow. We also intend to
carry the BNDC theory over to more expressive process calculi, e.g., the
pi/spi-calculus [2] and Mobile Ambients [13]. This would allow us to compare it with
recent security properties proposed on such calculi and reminiscent of some
Non-Interference ideas (see, e.g., [37,35]).
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Abstract.  Access  control  is  the  process  of  mediating  every  request  to 
resources and data maintained by a system and determining whether
the request should be granted or denied. The access control decision is
enforced by a mechanism implementing regulations established by a security
policy. Different access control policies can be applied, corresponding
to different criteria for defining what should, and what should not, be
allowed, and, in some sense, to different definitions of what ensuring
security means. In this chapter we investigate the basic concepts behind
access control design and enforcement, and point out different security
requirements that may need to be taken into consideration. We discuss
several access control policies, and models formalizing them, that have
been proposed in the literature or that are currently under investigation.


1  Introduction 

An important requirement of any information management system is to protect
data and resources against unauthorized disclosure (secrecy) and unauthorized
or improper modifications (integrity), while at the same time ensuring their
availability to legitimate users (no denials-of-service). Enforcing protection therefore
requires that every access to a system and its resources be controlled and that
all and only authorized accesses can take place. This process goes under the
name of access control. The development of an access control system requires
the definition of the regulations according to which access is to be controlled
and their implementation as functions executable by a computer system. The
development process is usually carried out with a multi-phase approach based
on the following concepts:

Security policy: it defines the (high-level) rules according to which access
control must be regulated.1

1 Often, the term policy is also used to refer to particular instances of a policy, that
is, actual authorizations and access restrictions to be enforced (e.g., Employees can
read bulletin-board).
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example,  from  laws,  practices,  and  organizational  regulations.  A  security  policy 
must  capture  all  the  different  regulations  to  be  enforced  and,  in  addition,  must 
also  consider  possible  additional  threats  due  to  the  use  of  a  computer  system. 
Access  control  policies  can  be  grouped  into  three  main  classes: 

Discretionary  (DAC)  (authorization-based)  policies  control  access  based  on 
the  identity  of  the  requestor  and  on  access  rules  stating  what  requestors  are 
(or  are  not)  allowed  to  do. 

Mandatory  (MAC)  policies  control  access  based  on  mandated  regulations  de¬ 
termined  by  a  central  authority. 

Role-based  (RBAC)  policies  control  access  depending  on  the  roles  that  users 
have  within  the  system  and  on  rules  stating  what  accesses  are  allowed  to 
users  in  given  roles. 

Discretionary  and  role-based  policies  are  usually  coupled  with  (or  include)  an 
administrative  policy  that  defines  who  can  specify  authorizations/rules  governing 
access  control. 

In  this  chapter  we  illustrate  different  access  control  policies  and  models  that 
have  been  proposed  in  the  literature,  also  investigating  their  low  level  imple¬ 
mentation  in  terms  of  security  mechanisms.  In  illustrating  the  literature  and  the 
current  status  of  access  control  systems,  of  course,  the  chapter  does  not  pretend 
to  be  exhaustive.  However,  by  discussing  different  approaches  with  their  advan¬ 
tages  and  limitations,  this  chapter  hopes  to  give  an  idea  of  the  different  issues  to 
be  tackled  in  the  development  of  an  access  control  system,  and  of  good  security 
principles  that  should  be  taken  into  account  in  the  design. 

The chapter is structured as follows. Section 2 introduces the basic concepts
of discretionary policies and authorization-based models. Section 3 shows the
limitations of authorization-based controls, motivating the need for
mandatory policies, which are then discussed in Section 4. Section 5 illustrates
approaches combining mandatory and discretionary principles with the goal of
achieving mandatory information flow protection without losing the flexibility
of discretionary authorizations. Section 6 illustrates several discretionary
policies and models that have been proposed. Section 7 illustrates role-based access
control policies. Finally, Section 8 discusses advanced approaches and directions
in the specification and enforcement of access control regulations.

2  Basic  concepts  of  discretionary  policies 

Discretionary policies enforce access control on the basis of the identity of the
requestors and explicit access rules that establish who can, or cannot, execute
which actions on which resources. They are called discretionary as users can be
given the ability to pass on their privileges to other users, where granting
and revocation of privileges are regulated by an administrative policy. Different
discretionary access control policies and models have been proposed in the
literature. We start in this section with the early discretionary models, to convey the
basic ideas of authorization specifications and their enforcement. We will come
Ann:  own, read, write, read, write, execute
Bob:  read, read, write
Carl: read, execute, read

Fig.  1.  An  example  of  access  matrix 


describe changes to the state of a system. These operations, whose effect on the
authorization state is illustrated in Figure 2, correspond to adding and removing
a subject, adding and removing an object, and adding and removing a privilege.
Each command has a conditional part and a body and has the form

command c(x_1, ..., x_k)
  if r_1 in A[x_s1, x_o1] and
     r_2 in A[x_s2, x_o2] and
     ...
     r_m in A[x_sm, x_om]
  then op_1
       op_2
       ...
       op_n
end.

with n > 0, m ≥ 0. Here r_1, ..., r_m are actions, op_1, ..., op_n are primitive
operations, while s_1, ..., s_m and o_1, ..., o_m are integers between 1 and k. If m = 0,
the command has no conditional part.

For  example,  the  following  command  creates  a  file  and  gives  the  creating 
subject  ownership  privilege  on  it. 

command CREATE(creator, file)
  create object file
  enter Own into A[creator, file] end.

The following commands allow an owner to grant to others, and revoke from
others, a privilege to execute an action on her files.

command CONFER_a(owner, friend, file)
  if Own in A[owner, file]
  then enter a into A[friend, file] end.
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For  instance,  a  copy  flag ,  denoted  *,  attached  to  a  privilege  may  indicate  that 
the  privilege  can  be  transferred  to  others.  Granting  of  authorizations  can  then 
be accomplished by the execution of commands like the one below (again here
TRANSFER_a is an abbreviation for as many commands as there are actions).

command TRANSFER_a(subj, friend, file)
  if a* in A[subj, file]
  then enter a into A[friend, file] end.
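
The CREATE, CONFER and TRANSFER commands can be mimicked with a few lines of Python. This is only a minimal sketch: the class `AccessMatrix`, its method names, and the encoding of the copy flag as a trailing `*` on the privilege name are assumptions made for illustration, not part of the HRU model's definition.

```python
from collections import defaultdict


class AccessMatrix:
    """Toy access matrix: cells[(subject, object)] is a set of privileges."""

    def __init__(self):
        self.objects = set()
        self.cells = defaultdict(set)

    def create(self, creator, obj):
        # CREATE: add the object and give the creator ownership of it.
        self.objects.add(obj)
        self.cells[(creator, obj)].add("own")

    def confer(self, action, owner, friend, obj):
        # CONFER_a: the condition is "Own in A[owner, file]".
        if "own" in self.cells[(owner, obj)]:
            self.cells[(friend, obj)].add(action)
            return True
        return False

    def transfer(self, action, subj, friend, obj):
        # TRANSFER_a: the condition is "a* in A[subj, file]" (copy flag).
        if action + "*" in self.cells[(subj, obj)]:
            self.cells[(friend, obj)].add(action)
            return True
        return False


m = AccessMatrix()
m.create("ann", "file1")
granted = m.confer("read", "ann", "bob", "file1")
# Bob holds "read" without the copy flag, so he cannot pass it on.
passed_on = m.transfer("read", "bob", "carl", "file1")
```

Each method corresponds to one command whose body contains a single primitive operation guarded by a condition on one cell of the matrix.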

The ability to specify commands of this type clearly provides flexibility, as
different administrative policies can be taken into account by defining
appropriate commands. For instance, an alternative administrative flag (called transfer
only and denoted +) can be supported, which gives the subject the ability to
pass on the privilege to others but for which, in so doing, the subject loses
the privilege. Such flexibility introduces an interesting problem referred to as
safety, and concerned with the propagation of privileges to subjects in the
system. Intuitively, given a system with initial configuration Q, the safety problem
is concerned with determining whether or not a given subject s can ever acquire
a given access a on an object o, that is, if there exists a sequence of requests
that executed on Q can produce a state Q' where a appears in a cell A[s,o]
that did not have it in Q. (Note that, of course, not all leakages of privileges
are bad and subjects may intentionally transfer their privileges to "trustworthy"
subjects. Trustworthy subjects are therefore ignored in the analysis.) It turns
out that the safety problem is undecidable in general (the halting problem of
a Turing machine can be reduced to it) [4]. It remains instead decidable for
cases where subjects and objects are finite, and in mono-operational systems,
that is, systems where the body of commands can have at most one
operation (while the conditional part can still be arbitrarily complex). However, as
noted in [81], mono-operational systems have the limitation of making create
operations practically useless: a single create command cannot do more than add
an empty row/column (it cannot write anything in it). It is therefore not
possible to support ownership or control relationships between subjects. Progress in
safety analysis was made in a later extension of the HRU model by Sandhu [81],
who proposed the TAM (Typed Access Matrix) model. TAM extends HRU with
strong typing: each subject and object has a type; the type is associated with the
subjects/objects when they are created and thereafter does not change. Safety
turns out to be decidable in polynomial time for cases where the system is monotonic
(privileges cannot be deleted), commands are limited to three parameters, and
there are no cyclic creates. Safety remains undecidable otherwise.


2.2  Implementation  of  the  access  matrix 

Although the matrix represents a good conceptualization of authorizations, it
is not appropriate for implementation. In a general system, the access matrix
will usually be enormous in size and sparse (most of its cells are likely to be
empty). Storing the matrix as a two-dimensional array is therefore a waste of
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Fig.  3.  Authorization  table,  ACLs,  and  capabilities  for  the  matrix  in  Figure  1 


Authorizations are represented by associating with each object an access control
list of 9 bits: bits 1 through 3 reflect the privileges of the file's owner, bits 4
through 6 those of the user group to which the file belongs, and bits 7 through 9
those of all the other users. The three bits correspond to the read (r), write (w),
and execute (x) privileges, respectively. For instance, ACL rwxr-x--x associated
with a file indicates that the file can be read, written, and executed by its owner,
read and executed by users belonging to the group associated with the file, and
executed by all the other users.
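
As an illustration of the 9-bit list just described, the following sketch splits such a string into the privileges of the owner, the group, and all other users. The function name `decode_acl` and the dictionary layout are hypothetical choices made for this example.

```python
def decode_acl(acl):
    """Split a 9-character string such as 'rwxr-x--x' into privilege sets."""
    if len(acl) != 9:
        raise ValueError("expected 9 permission characters")
    names = {"r": "read", "w": "write", "x": "execute"}
    classes = ("owner", "group", "others")
    result = {}
    for i, cls in enumerate(classes):
        # Characters 1-3 belong to the owner, 4-6 to the group, 7-9 to others.
        triple = acl[3 * i: 3 * i + 3]
        result[cls] = {names[ch] for ch in triple if ch != "-"}
    return result


perms = decode_acl("rwxr-x--x")
```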

3  Vulnerabilities  of  the  discretionary  policies 

In defining the basic concepts of discretionary policies, we have referred to
access requests on objects submitted by users, which are then checked against the
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Fig.  5.  An  example  of  security  lattice 


$c_2$ and the categories of $c_1$ include those of $c_2$. Formally, given a totally ordered
set of security levels $\mathcal{L}$, and a set of categories $\mathcal{C}$, the set of access classes is
$\mathcal{AC} = \mathcal{L} \times \wp(\mathcal{C})$,2 and
$\forall c_1 = (L_1, C_1), c_2 = (L_2, C_2) : c_1 \geq c_2 \iff L_1 \geq L_2 \wedge C_1 \supseteq C_2$.
Two classes $c_1$ and $c_2$ such that neither $c_1 \geq c_2$ nor $c_2 \geq c_1$ holds are said to be
incomparable.

It  is  easy  to  see  that  the  dominance  relationship  so  defined  on  a  set  of  access 
classes  AC  satisfies  the  following  properties. 

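- Reflexivity: $\forall x \in \mathcal{AC} : x \geq x$

- Transitivity: $\forall x, y, z \in \mathcal{AC} : x \geq y, y \geq z \Rightarrow x \geq z$

- Antisymmetry: $\forall x, y \in \mathcal{AC} : x \geq y, y \geq x \Rightarrow x = y$

- Existence of a least upper bound: $\forall x, y \in \mathcal{AC} : \exists! z \in \mathcal{AC}$

  • $z \geq x$ and $z \geq y$

  • $\forall t \in \mathcal{AC} : t \geq x$ and $t \geq y \Rightarrow t \geq z$.

- Existence of a greatest lower bound: $\forall x, y \in \mathcal{AC} : \exists! z \in \mathcal{AC}$

  • $x \geq z$ and $y \geq z$

  • $\forall t \in \mathcal{AC} : x \geq t$ and $y \geq t \Rightarrow z \geq t$.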

Access  classes  defined  as  above  together  with  the  dominance  relationship 
between  them  therefore  form  a  lattice  [31].  Figure  5  illustrates  the  security  lattice 
obtained  considering  security  levels  TS  and  S,  with  TS>S  and  the  set  of  categories 
{Nuclear, Army}. 
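
The dominance relationship and the two bounds can be made concrete with a short sketch. It assumes the two levels and two categories of the example; the representation of an access class as a `(level, frozenset_of_categories)` pair and the function names are illustrative choices, not notation from the text.

```python
from itertools import combinations

LEVELS = {"S": 0, "TS": 1}          # total order on levels: TS > S
CATEGORIES = ("Nuclear", "Army")


def dominates(c1, c2):
    """c1 >= c2 iff level(c1) >= level(c2) and cats(c1) include cats(c2)."""
    (l1, cats1), (l2, cats2) = c1, c2
    return LEVELS[l1] >= LEVELS[l2] and cats1 >= cats2


def lub(c1, c2):
    # Least upper bound: the higher level and the union of the categories.
    level = max(c1[0], c2[0], key=LEVELS.get)
    return (level, c1[1] | c2[1])


def glb(c1, c2):
    # Greatest lower bound: the lower level and the common categories.
    level = min(c1[0], c2[0], key=LEVELS.get)
    return (level, c1[1] & c2[1])


# All access classes: 2 levels x 4 subsets of the categories.
subsets = [frozenset(s) for r in range(len(CATEGORIES) + 1)
           for s in combinations(CATEGORIES, r)]
classes = [(lvl, s) for lvl in LEVELS for s in subsets]

a = ("TS", frozenset({"Nuclear"}))
b = ("S", frozenset({"Army"}))
```

Here `a` and `b` are incomparable: `a` has the higher level, but its categories do not include those of `b`.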

The semantics and use of the classifications assigned to objects and subjects
within the application of a multilevel mandatory policy are different depending
on whether the classification is intended for a secrecy or an integrity policy. We
next examine secrecy-based and integrity-based mandatory policies.

2 $\wp(\mathcal{C})$ denotes the powerset of $\mathcal{C}$.
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Fig.  6.  Information  flow  for  secrecy 


Given the no-write-down principle, it is clear now why users are allowed
to connect to the system at different access classes, so that they are able to
access information at different levels (provided that they are cleared for it). For
instance, Vicky has to connect to the system at a level below her clearance if she
wants to write some Unclassified information, such as working instructions for
John. Note that a lower class does not mean "fewer" privileges in absolute terms,
but only fewer reading privileges (see Figure 6).

Although users can connect to the system at any level below their clearance,
the strict application of the no-read-up and the no-write-down principles may
prove too rigid. Real world situations often require exceptions to the mandatory
restrictions. For instance, data may need to be downgraded (e.g., data subject
to embargoes that can be released after some time). Also, information released
by a process may be less sensitive than the information the process has read. For
instance, a procedure may access personal information regarding the employees
of an organization and return the benefits to be granted to each employee. While
the personal information can be considered Secret, the benefits can be considered
Confidential. To respond to situations like these, multilevel systems should
allow for exceptions, loosening or waiving restrictions, in a controlled way, for
processes that are trusted and ensure that information is sanitized (meaning the
sensitivity of the original information is lost).

Note also that DAC and MAC policies are not mutually exclusive, but can
be applied jointly. In this case, for an access to be granted, both i) the
necessary authorization for it must exist, and ii) the mandatory policy must
be satisfied. Intuitively, the discretionary policy operates within the boundaries of the
mandatory policy: it can only restrict the set of accesses that would be allowed
by MAC alone.
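
The joint application of the two policies can be sketched as a conjunction of two checks. The sketch below assumes a totally ordered set of levels (no categories) and represents authorizations as `(user, object, action)` triples; the level ordering, the object names, and the function names are assumptions introduced for illustration.

```python
LEVELS = {"Unclassified": 0, "Confidential": 1, "Secret": 2, "TopSecret": 3}


def mac_allows(subject_level, object_level, action):
    """No-read-up and no-write-down on a totally ordered set of levels."""
    s, o = LEVELS[subject_level], LEVELS[object_level]
    if action == "read":
        return s >= o      # no-read-up
    if action == "write":
        return o >= s      # no-write-down
    return False


def access_granted(authorizations, user, subject_level, obj, object_level, action):
    # Both conditions must hold: an explicit authorization (DAC)
    # and compliance with the mandatory policy (MAC).
    dac_ok = (user, obj, action) in authorizations
    return dac_ok and mac_allows(subject_level, object_level, action)


auths = {("vicky", "instructions", "write"), ("vicky", "report", "read")}

# Connected as Secret, Vicky cannot write an Unclassified object ...
blocked = access_granted(auths, "vicky", "Secret", "instructions", "Unclassified", "write")
# ... but she can after connecting at the Unclassified level.
allowed = access_granted(auths, "vicky", "Unclassified", "instructions", "Unclassified", "write")
```

Since the result is a conjunction, adding authorizations can never grant an access that the mandatory policy forbids.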
state $v_0$ is secure, and ii) the state transition $T$ is security preserving, that is, it
transforms a secure state into another secure state.

As noticed by McLean in his example called "System Z" [63], the BST
theorem does not actually guarantee security. The problem lies in the fact that no
restriction, other than preserving state security, is put on transitions. In his
System Z example, McLean shows how failing to control transitions can
compromise security. Consider a system Z whose initial state is secure and that has
only one type of transition: when a subject requests any type of access to an
object o, every subject and object in the system is downgraded to the lowest
possible access class and the access is granted. System Z satisfies the Bell and
LaPadula notion of security, but it is obviously not secure in any meaningful
sense. The problem pointed out by System Z is that transitions need to be
controlled. Accordingly, McLean proposes extending the model with a new function
$C : S \cup O \rightarrow \wp(S)$, which returns the set of subjects allowed to change the level
of its argument. A transition is secure if it allows changes to the level of a
subject/object $x$ only by subjects in $C(x)$; intuitively, these are subjects trusted for
downgrading. A system $(v_0, R, T)$ is secure if and only if i) $v_0$ is secure, ii) every
state reachable from $v_0$ by executing a finite sequence of one or more requests
from $R$ is (BLP) secure, and iii) $T$ is transition secure.
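
The System Z argument can be replayed on a toy state. The following is a simplified sketch, assuming integer levels and a state made only of a set of current accesses and a level assignment; the names `state_secure` and `system_z_request`, and the subject and object used, are hypothetical.

```python
LOW = 0


def state_secure(current_accesses, level):
    """State check: reads need level(s) >= level(o), writes level(o) >= level(s)."""
    for s, o, mode in current_accesses:
        if mode == "read" and level[s] < level[o]:
            return False
        if mode == "write" and level[o] < level[s]:
            return False
    return True


def system_z_request(current_accesses, level, s, o, mode):
    # System Z: downgrade every subject and object to the lowest class,
    # then grant the requested access.
    new_level = {x: LOW for x in level}
    return current_accesses | {(s, o, mode)}, new_level


level = {"john": 0, "secret_file": 2}
accesses = set()
before = state_secure(accesses, level)
accesses, level = system_z_request(accesses, level, "john", "secret_file", "read")
after = state_secure(accesses, level)
```

Both `before` and `after` evaluate to true, although a low subject has obtained read access to what was a high object: the state check alone says nothing about how the levels were changed.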

The problem with changing the security level of subjects and objects was
not captured formally as an axiom or property in the Bell and LaPadula model, but as
an informal design guidance called the tranquility principle. The tranquility
principle states that the classification of active objects should not be changed during
normal operation [55]. A subsequent revision of the model [10] introduced a
distinction between the level assigned to a subject (clearance) and its current
level (which could be any level dominated by the clearance), which also
implied changing the formulation of the axioms, introducing more flexibility in the
control.

Another property included in the Bell and LaPadula model is the
discretionary property, which constrains the set of current accesses b to be a subset of
the access matrix M. Intuitively, it enforces discretionary controls.

4.4  Integrity-based  mandatory  policies:  The  Biba  model 

The mandatory policy that we have discussed above protects only the
confidentiality of the information; no control is enforced on its integrity. Low classified
subjects could still be able to enforce improper indirect modifications to objects
they cannot write. With reference to our organization example, for instance,
integrity could be compromised if the Trojan Horse implanted by John in the
application wrote data in file Market (this operation would not be blocked by
the secrecy policy). Starting from the principles of the Bell and LaPadula model,
Biba [16] proposed a dual policy for safeguarding integrity, which controls the
flow of information and prevents subjects from indirectly modifying information they
cannot write. As for secrecy, each subject and object in the system is assigned
an integrity classification. The classifications and the dominance relationship
between them are defined as before. Examples of integrity levels are: Crucial
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ever,  if  a  subject  s  writes  an  object  o,  the  object  has  its  classification  down¬ 
graded  to  the  greatest  lower  bound  of  the  classification  of  the  two,  that  is, 
$\lambda'(o) = \mathrm{glb}(\lambda(s), \lambda(o))$.

Intuitively, the two policies attempt to apply a more dynamic behavior in the
enforcement of the constraints. The two approaches, however, suffer from drawbacks.
In the low-water mark for subjects approach, the ability of a subject to execute
a procedure may depend on the order in which operations are requested: a
subject may be denied the execution of a procedure because of read operations
executed before. The latter policy cannot actually be considered as safeguarding
integrity: given that subjects are allowed to write above their level, integrity
compromises can certainly occur; by downgrading the level of the object the
policy simply signals this fact.
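
The order dependence of the low-water mark for subjects approach can be seen in a small sketch. It assumes totally ordered integer integrity levels, so that the greatest lower bound is simply the minimum, and it assumes the usual formulation in which a read lowers the subject's level to the glb of the two levels; the level values and function names are hypothetical.

```python
def write_allowed(subject_level, object_level):
    # Integrity: a subject may write only objects that it dominates.
    return subject_level >= object_level


def read_low_water_mark(subject_level, object_level):
    # After a read the subject drops to glb(subject, object);
    # with a total order the glb is the minimum.
    return min(subject_level, object_level)


CRUCIAL, UNKNOWN = 2, 0   # hypothetical integrity levels

# Order 1: write a Crucial object first, then read a low-integrity object.
s = CRUCIAL
write_then_read = write_allowed(s, CRUCIAL)
s = read_low_water_mark(s, UNKNOWN)

# Order 2: read the low-integrity object first, then attempt the same write.
s = CRUCIAL
s = read_low_water_mark(s, UNKNOWN)
read_then_write = write_allowed(s, CRUCIAL)
```

The same pair of operations succeeds in the first order and fails in the second, which is the drawback noted above.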

As is visible from Figures 6 and 7, secrecy policies allow the flow of
information only from lower to higher (secrecy) classes while integrity policies allow the
flow of information only from higher to lower (integrity) classes. If both secrecy
and integrity have to be controlled, objects and subjects have to be assigned two
access classes, one for secrecy control and one for integrity control.

A major limitation of the policies proposed by Biba is that they only capture
integrity compromises due to improper information flows. However, integrity
is a much broader concept and additional aspects should be taken into account
(see Section 6.5).


4.5  Applying  mandatory  policies  to  databases 

The first formulation of the multilevel mandatory policies, and the Bell and LaPadula
model, simply assumed the existence of objects (information containers) to which
a classification is assigned. This assumption works well in the operating system
context, where objects to be protected are essentially files containing the data.
Later studies investigated the extension of mandatory policies to database
systems. While in operating systems access classes are assigned to files, database
systems can afford a finer-grained classification. Classification can in fact be
considered at the level of relations (equivalent to file-level classification in OS), at
the level of columns (different properties can have a different classification), at
the level of rows (properties referred to a given real world entity or association
have the same classification), or at the level of single cells (each data element,
meaning the value assigned to a property for a given entity or association, can
have a different classification), the latter being the finest possible classification.
Early efforts to classify information in database systems considered
classification at the level of each single element [50,61]. Element-level classification
is clearly appealing since it allows the assignment of a security class to each
single real world fact that needs to be represented. For instance, an employee's
name can be labeled Unclassified, while his salary can be labeled Secret; also
the salary of different employees can take on different classifications. However,
the support of fine-grained classifications together with the obvious constraint of
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Fig.  9.  An  example  of  a  relation  with  polyinstantiation  (a)  and  the  Unclassified  view 
on  it  (b) 


convey  information,  the  Unclassified  subject  should  see  no  difference  between 
values  that  are  actually  null  in  the  database  and  those  that  are  null  since  they 
have  a  higher  classification.5  To  produce  a  view  consistent  with  the  relational 
database  constraints  the  classification  needs  to  satisfy  at  least  the  following  two 
basic  constraints:  i)  the  key  attributes  must  be  uniformly  classified,  and  ii)  the 
classifications  of  nonkey  attributes  must  dominate  that  of  key  attributes.  If  it 
were  not  so,  the  view  at  some  levels  would  contain  a  null  value  for  some  or  all 
key  attributes  (and  therefore  would  not  satisfy  the  key  constraints). 
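
A view at a given level under element-level classification can be sketched as follows. This is a simplification that assumes two levels, U and S, and stores each cell as a `(value, classification)` pair; the sample row, the attribute names, and the function `view_at` are hypothetical.

```python
LEVELS = {"U": 0, "S": 1}


def view_at(rows, level):
    """Replace by None every cell whose classification is not dominated by `level`."""
    out = []
    for row in rows:
        out.append({attr: (val if LEVELS[cls] <= LEVELS[level] else None)
                    for attr, (val, cls) in row.items()})
    return out


rows = [{"name": ("Bob", "U"), "dept": ("Dept1", "U"), "salary": ("100K", "S")}]
u_view = view_at(rows, "U")
s_view = view_at(rows, "S")
```

Because the key attribute `name` is classified no higher than the other attributes, it is never masked while another cell of the same row remains visible, which is what the two constraints above are meant to ensure.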

To see how polyinstantiation can arise, suppose that an Unclassified subject,
whose view on the table in Figure 8(a) is as illustrated in Figure 8(b), requests
insertion of tuple (Ann, Dept1, 100K). According to the key constraints
imposed by the relational model, no two tuples can have the same value for the
key attributes. Therefore, if classifications were not taken into account, the
insertion could not be accepted. The database would have two alternative
choices: i) tell the subject that a tuple with the same key already exists, or ii)
replace the old tuple with the new one. The first solution introduces a covert
channel,6 since by rejecting the request the system would be revealing protected
information (meaning the existence of a Secret entity named Ann), and clearly
compromises secrecy. On the other hand, the second solution compromises
integrity, since high classified data would be lost, being overridden by the newly
inserted tuple. Both solutions are therefore inapplicable. The only remaining
solution would then be to accept the insertion and manage the presence of both
tuples (see Figure 9(a)). Two tuples would then exist with the same value, but
different classification, for their key (polyinstantiated tuples). A similar situation
happens if the Unclassified subject requests to update the salary of Sam to value
100K. Again, telling the subject that a value already exists would compromise
secrecy (if the subject is not supposed to distinguish between real nulls and values

5  Some  proposals  do  not  adopt  this  assumption.  For  instance,  in  LDV  [43],  a  special 
value  “restricted”  appears  in  a  subject’s  view  to  denote  the  existence  of  values  not 
visible  to  the  subject. 

6  We  will  talk  more  about  covert  channels  in  Section  4.6. 
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only  at  a  higher  level).  The  presence  of  two  tuples  with  the  same  key  and  same 
key  classification  but  that  differ  for  the  value  and  classification  of  some  of  its 
attributes  can  be  interpreted  as  a  single  real  world  entity  for  which  different 
values  are  recorded  (corresponding  to  the  different  beliefs  at  different  levels). 
Unfortunately, polyinstantiation quickly gets out of hand, and the
execution of a few operations could result in a database whose semantics is no
longer clear. Subsequent work tried to establish constraints to maintain the
semantic integrity of the database state [69, 75, 90]. However, probably because
of  all  the  complications  and  semantics  confusion  that  polyinstantiation  bears, 
fine-grained  multilevel  databases  did  not  have  much  success,  and  current  DBMSs 
do  not  support  element-level  classification.  Commercial  systems  (e.g.,  Trusted 
Oracle  [66]  and  SYBASE  Secure  SQL  Server)  support  tuple  level  classification. 
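
The way polyinstantiated tuples arise can be illustrated with a minimal sketch. It assumes a two-level lattice and tuple-level (rather than element-level) classification to keep the code short; the class name MultilevelTable and the Secret salary value 200K are hypothetical and are not taken from the figures.

```python
from dataclasses import dataclass

LEVELS = {"Unclassified": 0, "Secret": 1}  # assumed two-level lattice


@dataclass(frozen=True)
class Row:
    name: str     # key attribute
    dept: str
    salary: str
    level: str    # tuple-level classification (simplifying assumption)


class MultilevelTable:
    def __init__(self):
        self.rows = []

    def view(self, subject_level):
        """Rows whose classification is dominated by the subject's level."""
        return [r for r in self.rows
                if LEVELS[r.level] <= LEVELS[subject_level]]

    def insert(self, subject_level, name, dept, salary):
        """Insert at the subject's level; reject only a same-level key clash."""
        for r in self.rows:
            if r.name == name and r.level == subject_level:
                raise ValueError("duplicate key at this level")
        # A clash with a row at another level is neither reported (secrecy)
        # nor resolved by overwriting (integrity): both rows are kept.
        self.rows.append(Row(name, dept, salary, subject_level))


table = MultilevelTable()
table.insert("Secret", "Ann", "Dept1", "200K")         # hypothetical Secret tuple
table.insert("Unclassified", "Ann", "Dept1", "100K")   # accepted silently

low_view = table.view("Unclassified")   # sees only the Unclassified Ann
high_view = table.view("Secret")        # sees both polyinstantiated tuples
```

In this sketch the key of the stored relation is effectively the pair (name, level), which is what allows two tuples for Ann to coexist.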

It is worth noticing that, although polyinstantiation is often blamed as the
reason why multilevel relational databases did not succeed, polyinstantiation
is not necessarily always bad. Controlled polyinstantiation may, for example,
be  useful  to  support  cover  stories  [38, 49],  meaning  non-true  data  whose  presence 
in  the  database  is  meant  to  hide  the  existence  of  the  actual  value.  Cover  stories 
are useful when the fact that a given piece of data is not released is by itself a cause of
information leakage. For instance, suppose that a subject requests access to a
hospital’s data and the hospital returns, for all its patients except a few,
the illness for which they are being treated. Suppose also that HIV never appears
as an illness value. Observing this, the recipient may infer that
the patients whose illness is not disclosed probably suffer from HIV. The
hospital could have avoided exposure to such an inference by simply releasing
a non-true alternative value (cover story) for these patients. Intuitively, cover
stories are “lies” that the DBMS tells uncleared subjects so as not to disclose
(directly or indirectly) the actual values to be protected. We do note that, while
cover stories are useful for protection, they have raised objections because of the possible
integrity compromises which they may indirectly cause, as low level subjects can
base their actions on cover stories they believe to be true.

A complicating aspect in the support of a mandatory policy at a fine-grained
level is that the definition of the access class to be associated with each piece
of data is not always easy [30]. This is the case, for example, of association and
aggregation requirements, where the classification of a set of values (properties,
resp.) is higher than the classification of each of the values taken individually.
As  an  example,  while  names  and  salaries  in  an  organization  may  be  considered 
Unclassified,  the  association  of  a  specific  salary  with  an  employee’s  name  can 
be  considered  Secret  (association  constraint).  Similarly,  while  the  location  of  a 
single  military  ship  can  be  Unclassified,  the  location  of  all  the  ships  of  a  fleet 
can  be  Secret  (aggregation  constraint),  as  by  knowing  it  one  could  infer  that 
some  operations  are  being  planned.  Proper  data  classification  assignment  is  also 
complicated  by  the  need  to  take  into  account  possible  inference  channels  [30, 
47, 59].  There  is  an  inference  channel  between  a  set  of  data  x  and  a  set  of  data 
y  if,  by  knowing  x  a  user  can  infer  some  information  on  y  (e.g.,  an  inference 
channel  can  exist  between  an  employee’s  taxes  and  her  salary).  Inference-aware 
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only  overt  channels  of  information  (i.e.,  flow  through  legitimate  channels);  they 
still remain vulnerable to covert channels. Covert channels are channels that
are  not  intended  for  normal  communication,  but  still  can  be  exploited  to  infer 
information.  For  instance,  consider  the  request  of  a  low  level  subject  to  write  a 
non-existent  high  level  file  (the  operation  is  legitimate  since  write-up  operations 
are  allowed).  Now,  if  the  system  returns  the  error,  it  exposes  itself  to  improper 
leakages  due  to  malicious  high  level  processes  creating  and  destroying  the  high 
level file to signal information to low processes. However, if the low process is not
informed of the error, or the system automatically creates the file, subjects may
not be notified of errors made in legitimate attempts to write. As another
example, consider a low level subject that requests a resource (e.g., CPU or lock)
that is held by a high level subject. The system, by not allocating the resource
because it is busy, can again be exploited to signal information at lower levels
(high level processes can modulate the signal by requesting or releasing resources).
If  a  low  process  can  see  any  different  result  due  to  a  high  process  operation,  there 
is  a  channel  between  them.  Channels  may  also  be  enacted  without  modifying  the 
system’s response to processes. This is, for example, the case of timing channels,
which can be enacted when it is possible for a high process to affect the system’s
response time to a low process. With timing channels the response that the
low process receives is always the same; it is the time at which the low process
receives the response that communicates information. Therefore, in principle,
any  common  resource  or  observable  property  of  the  system  state  can  be  used 
to  leak  information.  Consideration  of  covert  channels  requires  particular  care  in 
the  design  of  the  enforcement  mechanism.  For  instance,  locking  and  concurrency 
mechanisms must be revised and properly designed [7]. A complication in
their design is that care must be taken to prevent the policy for blocking covert
channels from introducing denials-of-service. For instance, a trivial solution to avoid
covert channels between high and low level processes competing over common
resources could be to always give priority to low level processes (possibly
terminating high level processes). This approach, however, exposes the system to
denial-of-service attacks whereby low level processes can prevent high level (and
therefore, presumably, more important) processes from completing their activity.
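
The resource-contention channel described above can be sketched as a small deterministic simulation. This is only an illustration under simplifying assumptions: the two processes take turns in lock step, and the names SharedResource and transmit are hypothetical.

```python
class SharedResource:
    """A lock-like resource visible to both high and low processes."""

    def __init__(self):
        self.holder = None

    def acquire(self, who):
        if self.holder is None:
            self.holder = who
            return True
        return False          # "busy": this answer is what leaks

    def release(self, who):
        if self.holder == who:
            self.holder = None


def transmit(secret_bits):
    """Return the bits a low process recovers by probing the resource."""
    resource = SharedResource()
    received = []
    for bit in secret_bits:
        # The high process modulates the signal: it holds the resource to send 1.
        if bit == 1:
            resource.acquire("high")
        # The low process probes the resource and observes the outcome.
        got_it = resource.acquire("low")
        received.append(0 if got_it else 1)
        # Both sides release before the next round.
        resource.release("low")
        resource.release("high")
    return received
```

No legitimate read of high data ever takes place: the low process learns each bit only from whether its own request succeeds.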

Covert channels are difficult to control also because of the difficulty of mapping
an access control model’s primitives to a computer system [64]. For this
reason, covert channel analysis is usually carried out in the implementation
phase, to make sure that the implementation of the model’s primitives is not too
weak. Covert channel analysis can be based on tracing the information flows
in programs [31], checking programs for shared resources that can be used to
transfer information [52], or checking the system clock for timing channels [92].
Besides the complexity, the limitation of such solutions is that covert channels
are found at the end of the development process, when system changes are
much more expensive to make. Interface models have been proposed which
attempt to rule out covert channels in the modeling phase [64, 37]. Rather
than  specifying  a  particular  method  to  enforce  security,  interface  models  specify 
restrictions  on  a  system’s  input/output  that  must  be  obeyed  to  avoid  covert 
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Conflict  of  interest  class  Conflict  of  interest  class 


Fig.  12.  An  example  of  object  organization 


-  belongs  to  an  entirely  different  conflict  of  interest  class. 

*-property Write access is only permitted if

-  access  is  permitted  by  the  simple  security  rule,  and 

-  no  object  can  be  read  which  i)  is  in  a  different  company  dataset  than 
the  one  for  which  write  access  is  requested,  and  ii)  contains  unsanitized 
information. 

The  term  subject  used  in  the  properties  is  to  be  interpreted  as  user  (mean¬ 
ing  access  restrictions  are  referred  to  users).  The  reason  for  this  is  that,  unlike 
mandatory  policies  that  control  processes,  the  Chinese  Wall  policy  controls  users. 
It would therefore not make sense to enforce restrictions on processes, as a user
would be able to acquire information about organizations that are in conflict of
interest simply by running two different processes.

Intuitively,  the  simple  security  rule  blocks  direct  information  leakages  that 
can  be  attempted  by  a  single  user,  while  the  *-property  blocks  indirect  infor¬ 
mation  leakages  that  can  occur  with  the  collusion  of  two  or  more  users.  For 
instance,  with  reference  to  Figure  12,  an  indirect  improper  flow  could  happen 
if,  i)  a  user  reads  information  from  object  ObjA-1  and  writes  it  into  ObjC-1, 
and  subsequently  ii)  a  different  user  reads  information  from  ObjC-1  and  writes 
it  into  ObjB-1. 
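
The two properties and the indirect flow just described can be sketched as follows. The object layout is a hypothetical stand-in for Figure 12 (companies A and B in one conflict of interest class, company C in another), and "can be read" in the *-property is interpreted here as "has already been read by the user", which is an assumption of this sketch.

```python
# Hypothetical layout standing in for Figure 12: object -> (dataset, COI class)
OBJECTS = {
    "ObjA-1": ("CompanyA", "COI-1"),
    "ObjB-1": ("CompanyB", "COI-1"),
    "ObjC-1": ("CompanyC", "COI-2"),
}
SANITIZED = set()    # no object is sanitized in this sketch


def may_read(history, obj):
    """Simple security rule, given the set of objects the user has accessed."""
    dataset, coi = OBJECTS[obj]
    for seen in history:
        seen_dataset, seen_coi = OBJECTS[seen]
        if seen_coi == coi and seen_dataset != dataset:
            return False      # same conflict class, competing dataset
    return True


def may_write(history, obj):
    """*-property: no unsanitized object from another dataset has been read."""
    if not may_read(history, obj):
        return False
    dataset, _ = OBJECTS[obj]
    return all(OBJECTS[seen][0] == dataset or seen in SANITIZED
               for seen in history)


first_user = {"ObjA-1"}                           # has read ObjA-1
step_one = may_write(first_user, "ObjC-1")        # copy into ObjC-1: blocked
direct_leak = may_read(first_user, "ObjB-1")      # competitor's data: blocked
```

With the write in step i) refused, the second user never finds company A's information in ObjC-1, so the indirect flow towards ObjB-1 cannot take place.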

Clearly, the application of the Chinese Wall policy still has some limitations.
In particular, strict enforcement of the properties may prove too rigid and, as
for the mandatory policy, there will be the need for exceptions and support of
sanitization (which is mentioned, but not investigated, in [22]). Also, the
enforcement of the policies requires keeping and querying the history of the accesses.
A further point to take into consideration is ensuring that the enforcement of
the properties will not block the system from working. For instance, if in a system
composed of ten users there are eleven company datasets in a conflict of
interest class, then one dataset will remain inaccessible. This aspect was noticed
in [22], where the authors point out that there must be at least as many users
as the maximum number of datasets which appear together in a conflict of
interest class. However, while this condition makes the system operation possible,
it cannot ensure it when users are left a completely free choice of the datasets
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access  attributes  therefore  control  information  flow.  When  a  new  value  of  some 
object y is produced as a function of objects x1, ..., xn, then the potential
access attribute of y is set to be the intersection of the potential access attributes
of x1, ..., xn.

Walter et al. [87] propose an interpretation of the mandatory controls within
the discretionary context. Intuitively, the policy behind this approach, which we
call strict policy, is based on the same principles as the mandatory policy. Access
control lists are used in place of labels, and the inclusion relationship between
sets is used in place of the dominance relationship between labels. Information
flow restrictions impose that a process can write an object o only if o is at least
as protected in reading as all the objects read by the process up to that point. (An
object  o  is  at  least  as  protected  in  reading  as  another  object  o'  if  the  set  of 
subjects  allowed  to  read  o  is  contained  in  the  set  of  subjects  allowed  to  read 
o'.)  Although  the  discretionary  flexibility  of  specifying  accesses  is  not  lost,  the 
overall  flexibility  is  definitely  reduced  by  the  application  of  the  strict  policy.  After 
having  read  an  object  o,  a  process  is  completely  unable  to  write  any  object  less 
protected  in  reading  than  o,  even  if  the  write  operation  would  not  result  in  any 
improper  information  leakage. 
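
The write restriction of the strict policy amounts to a subset test on reader sets, as the following minimal sketch shows. The object names, the reader sets, and the Process class are hypothetical, and the ordinary discretionary check on the write itself is omitted.

```python
# Hypothetical access control lists: object -> subjects allowed to read it
READERS = {
    "payroll": {"alice"},
    "report": {"alice", "bob"},
    "vault": {"alice"},
}


def at_least_as_protected(o, o_prime):
    """True if the readers of o are contained in the readers of o_prime."""
    return READERS[o] <= READERS[o_prime]


class Process:
    def __init__(self):
        self.read_so_far = set()

    def read(self, obj):
        self.read_so_far.add(obj)

    def may_write(self, obj):
        """Strict policy: obj must be at least as protected as everything read."""
        return all(at_least_as_protected(obj, seen)
                   for seen in self.read_so_far)


p = Process()
before = p.may_write("report")        # nothing read yet, so the write is allowed
p.read("payroll")
after = p.may_write("report")         # bob could read report but not payroll
still_ok = p.may_write("vault")       # same readers as payroll
```

The refusal in the second case holds even if the process would have written nothing derived from payroll, which is the loss of flexibility noted above.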

Bertino  et  al.  [14]  present  an  enhancement  of  the  strict  policy  to  introduce 
more flexibility in the policy enforcement. The proposal is based on the
observation that whether or not some information can be released also depends on the
procedure enacting the release. A process may access sensitive data and yet not
release any sensitive information. Such a process should be allowed to bypass the
restrictions of the strict policy, thus representing an exception. On the other hand,
the information produced by a process may be more sensitive than the
information the process has read. An exception should in this case restrict the write
actions  otherwise  allowed  by  the  strict  policy.  Starting  from  these  observations, 
Bertino  et  al.  [14]  allow  procedures  to  be  granted  exceptions  to  the  strict  policy. 
The  proposal  is  developed  in  the  context  of  object-oriented  systems,  where  the 
modularity  provided  by  methods  associated  with  objects  allows  users  to  identify 
specific  pieces  of  trusted  code  for  which  exceptions  can  be  allowed,  and  therefore 
provide  flexibility  in  the  application  of  the  control.  Exceptions  can  be  positive 
or  negative.  A  positive  exception  overrides  a  restriction  imposed  by  the  strict 
policy, permitting an information flow which would otherwise be blocked. A
negative exception overrides a permission stated by the strict policy, forbidding an
information flow which would otherwise be allowed. Two kinds of exceptions are
supported by the model: reply-exceptions and invoke-exceptions. Reply
exceptions apply to the information returned by a method. Intuitively, positive reply
exceptions  apply  when  the  information  returned  by  a  method  is  less  sensitive 
than  the  information  the  method  has  read.  Reply  exceptions  can  waive  the  strict 
policy  restrictions  and  allow  information  returned  by  a  method  to  be  disclosed 
to  users  not  authorized  to  read  the  objects  that  the  method  has  read.  Invoke 
exceptions  apply  during  a  method’s  execution,  for  write  operations  that  the 
method  requests.  Intuitively,  positive  invoke  exceptions  apply  to  methods  that 
are  trusted  not  to  leak  (through  write  operations  or  method  invocations)  the 




Fig. 13. An example of user-group hierarchy (node labels: Public; Citizens, CS-Dept, Eng-Dept, Non-citizens; CS-Faculty; Jim, Mary, Jeremy, George, Lucy, Mike, Sam)


Fig.  14.  An  example  of  object  hierarchy 


reflect the logical file system tree structure, while in object-oriented systems it
can  reflect  the  class  (is-a)  hierarchy.  Figure  14  illustrates  an  example  of  object 
hierarchy.  Even  actions  can  be  organized  hierarchically,  where  the  hierarchy  may 
reflect  an  implication  of  privileges  (e.g.,  write  is  more  powerful  than  read  [70]) 
or  a  grouping  of  sets  of  privileges  (e.g.,  a  “writing  privileges”  group  can  be  de¬ 
fined  containing  write,  append,  and  undo  [84]).  These  hierarchical  relationships 
can  be  exploited  i)  to  support  preconditions  on  accesses  (e.g.,  in  Unix  a  sub¬ 
ject  needs  the  execute,  x,  privilege  on  a  directory  in  order  to  access  the  files 
within  it),  or  ii)  to  support  authorization  implication,  that  is,  authorizations 
specified  on  an  abstraction  apply  to  all  its  members.  Support  of  abstractions 
with implications provides a shorthand way to specify authorizations, clearly
simplifying authorization management. As a matter of fact, in most situations
the ability to exercise privileges depends on the membership of users in groups
or of objects in collections: translating these requirements into basic triples of the
form (user, object, action) that then have to be managed individually is a
considerable administrative burden, and makes it difficult to maintain both satisfactory
security and administrative efficiency. However, although there are cases where
abstractions can work just fine, there will be many cases where exceptions (i.e.,
authorizations applicable to all members of a group but a few) will need to be
supported. This observation has led to the combined support of both positive
and negative authorizations. Traditionally, positive and negative authorizations
have been used in mutual exclusion, corresponding to two classical approaches
to access control, namely:
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-  what  if  for  two  authorizations  the  most  specific  relationship  appear  reversed 
over different domains? For instance, consider authorizations (CS-Faculty,
read+, mail) and (CS-Dept, read-, personal); the first has a more specific
subject,  while  the  second  has  a  more  specific  object  (see  Figures  13  and  14). 

A  slightly  alternative  policy  on  the  same  line  as  the  most  specific  policy 
is what in [48] is called most-specific-along-a-path-takes-precedence. This policy
considers  an  authorization  specified  on  an  element  x  as  overriding  an  autho¬ 
rization  specified  on  a  more  general  element  y  only  for  those  elements  that  are 
members  of  y  because  of  x.  Intuitively,  this  policy  takes  into  account  the  fact 
that,  even  in  the  presence  of  a  more  specific  authorization,  the  more  general 
authorization  can  still  be  applicable  because  of  other  paths  in  the  hierarchy. 
For  instance,  consider  the  group  hierarchy  in  Figure  13  and  suppose  that  for 
an access a positive authorization is granted to Public while a negative
authorization is granted to CS-Dept. What should we decide for George? On the one
hand, it is true that CS-Dept is more specific than Public; on the other hand,
however, George belongs to Eng-Dept, and for Eng-Dept members the
positive authorization is not overridden. While the most-specific-takes-precedence
policy  would  consider  the  authorization  granted  to  Public  as  being  overridden 
for  George,  the  most-specific-along-a-path  considers  both  authorizations  as  ap¬ 
plicable  to  George.  Intuitively,  in  the  most-specific-along-a-path  policy,  an  au¬ 
thorization  propagates  down  the  hierarchy  until  overridden  by  a  more  specific 
authorization  [35]. 
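
The difference between the two policies on the George example can be sketched as follows. The hierarchy fragment is an assumption made for illustration: George is taken to be a direct member of both CS-Dept and Eng-Dept, Lucy's membership in CS-Dept alone is hypothetical, and the function names are not taken from [48] or [35].

```python
# Hypothetical fragment of the hierarchy (child -> direct parents).
PARENTS = {
    "George": ["CS-Dept", "Eng-Dept"],
    "Lucy": ["CS-Dept"],
    "CS-Dept": ["Public"],
    "Eng-Dept": ["Public"],
    "Public": [],
}
# Signed authorizations for one fixed access, as in the example above.
AUTHS = {"Public": "+", "CS-Dept": "-"}


def along_a_path(node):
    """Signs reaching node when propagation stops at the first override."""
    if node in AUTHS:
        return {AUTHS[node]}      # overrides whatever lies above on this path
    signs = set()
    for parent in PARENTS[node]:
        signs |= along_a_path(parent)
    return signs


def ancestors(node):
    """All groups the node belongs to, directly or indirectly."""
    found = set()
    for parent in PARENTS[node]:
        found |= {parent} | ancestors(parent)
    return found


def most_specific(user):
    """Signs left after dropping any authorization with a more specific rival."""
    holders = [g for g in {user} | ancestors(user) if g in AUTHS]
    return {AUTHS[g] for g in holders
            if not any(g in ancestors(h) for h in holders)}
```

Under these assumptions along_a_path("George") contains both signs, since the positive authorization still reaches George through Eng-Dept, whereas most_specific("George") retains only the denial. Either way a further decision policy is needed when both signs survive.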

The  most  specific  argument  does  not  always  apply.  For  instance,  an  organi¬ 
zation  may  want  to  be  able  to  state  that  consultants  should  not  be  given  access 
to  private  projects,  no  exceptions  allowed.  However,  if  the  most  specific  policy  is 
applied,  any  authorization  explicitly  granted  to  a  single  consultant  will  override 
the  denial  specified  by  the  organization.  To  address  situations  like  this,  some 
approaches proposed adopting explicit priorities. In ORION [70], authorizations
are classified as strong or weak: weak authorizations override each other based on
the most-specific policy, and strong authorizations override weak authorizations
(no matter their specificity) and cannot be overridden. Given that strong
authorizations must always be obeyed, they are required to be consistent. However,
this requirement may not always be enforceable. This is, for example, the
case  where  groupings  are  not  explicitly  defined  but  depend  on  the  evaluation  of 
some  conditions  (e.g.,  “all  objects  owned  by  Tom”,  “all  objects  created  before 
1/1/01”).  Also,  while  the  distinction  between  strong  and  weak  authorizations 
is  convenient  in  many  situations  and,  for  example,  allows  us  to  express  the  or¬ 
ganizational  requirement  just  mentioned,  it  is  limited  to  two  levels  of  priority, 
which  may  not  be  enough.  Many  other  conflict  resolution  policies  can  be  applied. 
Some  approaches,  extending  the  strong  and  weak  paradigm,  proposed  adopting 
explicit priorities; however, these solutions do not appear viable, as the
authorization specifications may not always be clear. Other approaches (e.g., [84])
proposed making authorization priority dependent on the order in which
authorizations are listed (i.e., the authorization that is encountered first applies).
This  approach,  however,  has  the  drawback  that  granting  or  removing  an  au- 
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specifications.  However,  the  complications  brought  by  negative  authorizations 
are  not  due  to  negative  authorizations  themselves,  but  to  the  different  semantics 
that  the  presence  of  permissions  and  denials  can  have,  that  is,  to  the  complex¬ 
ity  of  the  different  real  world  scenarios  and  requirements  that  may  need  to  be 
captured.  There  is  therefore  a  trade-off  between  expressiveness  and  simplicity. 
For  this  reason,  most  current  systems  adopting  negative  authorizations  for  ex¬ 
ception  support  impose  specific  conflict  resolution  policies,  or  support  a  limited 
form  of  conflict  resolution.  For  instance,  in  the  Apache  server  [6],  authorizations 
can  be  positive  and  negative  and  an  ordering  (“deny, allow”  or  “allow, deny”) 
can  be  specified  dictating  how  negative  and  positive  authorizations  are  to  be  in¬ 
terpreted.  In  the  “deny, allow”  order,  negative  authorizations  are  evaluated  first 
and access is allowed by default (open policy). Any client that does not match a
negative  authorization  or  matches  a  positive  authorization  is  allowed  access.  In 
the  “allow, deny”  order,  the  positive  authorizations  are  evaluated  first  and  access 
is  denied  by  default  (closed  policy).  Any  client  that  does  not  match  a  positive 
authorization  or  does  match  a  negative  authorization  will  be  denied  access. 
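
The two orderings can be summarized as a small decision function. This is only a model of the behaviour described above, not Apache code or configuration syntax; the function name decide and its boolean parameters are hypothetical.

```python
def decide(order, matches_allow, matches_deny):
    """Return True if access is granted under the given evaluation order.

    matches_allow / matches_deny say whether the client matches some
    positive / negative authorization.
    """
    if order == "deny,allow":
        # Open policy: granted unless denied, and an allow overrides a deny.
        return (not matches_deny) or matches_allow
    if order == "allow,deny":
        # Closed policy: refused unless allowed, and a deny overrides an allow.
        return matches_allow and not matches_deny
    raise ValueError("order must be 'deny,allow' or 'allow,deny'")
```

The two orders differ in exactly two situations: a client matching both kinds of authorization, and a client matching neither.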


More  recent  approaches  are  moving  towards  the  development  of  flexible 
frameworks  with  the  support  of  multiple  conflict  resolution  and  decision  policies. 
We  will  examine  them  in  Section  8. 


Other  advancements  in  authorization  specification  and  enforcement  have 
been  carried  out  with  reference  to  specific  applications  and  data  models.  For 
instance,  authorization  models  proposed  for  object-oriented  systems  (e.g.,  [2, 35, 
71])  exploit  the  encapsulation  concept,  meaning  the  fact  that  access  to  objects  is 
always  carried  out  through  methods  (read  and  write  operations  being  primitive 
methods).  In  particular,  users  granted  authorizations  to  invoke  methods  can  be 
given  the  ability  to  successfully  complete  them,  without  need  to  have  the  au¬ 
thorizations  for  all  the  accesses  that  the  method  execution  entails.  For  instance, 
in  OSQL,  each  derived  function  (i.e.,  method)  can  be  specified  as  supporting 
static  or  dynamic  authorizations  [2].  A  dynamic  authorization  allows  the  user  to 
invoke  the  function,  but  its  successful  completion  requires  the  user  to  have  the 
authorization  for  all  the  calls  the  function  makes  during  its  execution.  With  a 
static authorization, calls made by the function are checked against the authorizations
of the creator of the function, instead of those of the calling user. Intuitively, static
authorizations behave like the setuid (set user id) option provided by the Unix operating
system which, attached to a program (e.g., lpr), implies that all access control
checks are to be performed against the authorizations of the program’s owner
(instead of those of the caller, as would otherwise be the case). A similar feature is also
proposed  in  [71],  where  each  method  is  associated  with  a  principal,  and  accesses 
requested  during  a  method  execution  are  checked  against  the  authorization  of 
the  method’s  principal.  Encapsulation  is  also  exploited  by  the  Java  2  security 
model  [83]  where  authorizations  can  be  granted  to  code,  and  requests  to  access 
resources  are  checked  against  the  authorizations  of  the  code  directly  attempting 
the  access. 
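
The distinction between static and dynamic authorizations can be sketched as a choice of the principal against which nested calls are checked. The users, operations, and the invoke function below are hypothetical and do not reproduce OSQL syntax.

```python
# Hypothetical authorization base: (user, operation) pairs.
AUTHORIZED = {
    ("alice", "read_salary"),
    ("alice", "salary_report"),
    ("bob", "salary_report"),
}
# Hypothetical derived function: its creator and the calls it makes.
FUNCTIONS = {"salary_report": {"creator": "alice", "calls": ["read_salary"]}}


def invoke(user, function, mode):
    """Return True if user may invoke function and it can complete."""
    if mode not in ("static", "dynamic"):
        raise ValueError("mode must be 'static' or 'dynamic'")
    if (user, function) not in AUTHORIZED:
        return False
    spec = FUNCTIONS[function]
    # Static: nested calls are checked against the creator (setuid-like).
    # Dynamic: nested calls are checked against the calling user.
    principal = spec["creator"] if mode == "static" else user
    return all((principal, call) in AUTHORIZED for call in spec["calls"])
```

Here bob can complete salary_report only in static mode, since he holds no authorization for read_salary himself.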
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derivability  of  A  if  R  is  an  ASLONGAS  rule 
derivability  of  A  if  R  is  an  UPON  rule 
derivability  of  A  if  R  is  a  WHENEVER  rule 


Fig.  16.  Semantics  of  the  different  temporal  operators  [13] 


guarantees  efficient  access.  The  model  is  focussed  on  time-based  constraints  and 
reasoning  and  allows  expressing  authorization  relationships  and  derivation  not 
covered  in  other  models.  However,  it  does  not  address  the  enforcement  of  dif¬ 
ferent  implication  and  conflict  resolution  policies  (conflicts  between  permissions 
and  denials  are  solved  according  to  the  denials-take-precedence  policy). 


6.3  A  calculus  for  access  control 


Abadi  et  al.  [1]  present  a  calculus  for  access  control  that  combines  authentication 
(i.e.,  identity  check)  and  authorization  control,  taking  also  into  account  possible 
delegation  of  privileges  among  parties.  The  calculus  is  based  on  the  notion  of 
principals. Principals are sources of requests and make statements (e.g., “read
file tmp”). Principals can be either simple (e.g., users, machines, and
communication channels) or composite. Composite principals are obtained by combining
principals by means of constructors that capture groups and
delegations. Principals can be as follows [1]:


- Users and machines.

- Channels, such as input devices and cryptographic channels.

- Conjunction of principals, of the form A ∧ B. A request from A ∧ B is a
request that both A and B make (it is not necessary that the request be
made in concert).

- Groups define groups of principals; membership of principal Pi in group Gi
is written Pi ⇒ Gi. Disjunction A ∨ B denotes a group composed only of
A and B.
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-  Ownership:  Each  object  is  associated  with  an  owner,  who  generally  coincides 
with  the  user  who  created  the  object.  Users  can  grant  and  revoke  authoriza¬ 
tions  on  the  objects  they  own. 

-  Decentralized:  Extending  the  previous  approaches,  the  owner  of  an  object 
(or its administrators) can delegate to other users the privilege of specifying
authorizations,  possibly  with  the  ability  of  further  delegating  it. 

Decentralized  administration  is  convenient  since  it  allows  users  to  delegate 
administrative  privileges  to  others.  Delegation,  however,  complicates  the  autho¬ 
rization  management.  In  particular,  it  becomes  more  difficult  for  users  to  keep 
track  of  who  can  access  their  objects.  Furthermore,  revocation  of  authorizations 
becomes  more  complex.  There  are  many  possible  variations  on  the  way  decentral¬ 
ized  administration  works,  which  may  differ  in  the  way  the  following  questions 
are  answered. 

-  what  is  the  granularity  of  administrative  authorizations? 

-  can  delegation  be  restricted,  that  is,  can  the  grantor  of  an  administrative 
authorization  impose  restrictions  on  the  subjects  to  which  the  recipient  can 
further  grant  the  authorization? 

-  who  can  revoke  authorizations? 

-  what  about  authorizations  granted  by  the  revokee? 

In  general,  existing  decentralized  policies  allow  users  to  grant  administra¬ 
tion  for  a  specific  privilege  (meaning  a  given  access  on  a  given  object).  They 
do not, however, allow constraints to be placed on the subjects to which the
recipient receiving administrative authority can grant the access. This feature could,
however, be useful. For instance, an organization could delegate to one of its
employees the granting of access to some resources, constraining the authorizations
she can grant to employees working within her laboratory. Usually,
authorizations can be revoked only by the user who granted them (or, possibly, by the
object’s  owner).  When  an  administrative  authorization  is  revoked,  the  problem 
arises  of  dealing  with  the  authorizations  specified  by  the  users  from  whom  the 
administrative  privilege  is  being  revoked.  For  instance,  suppose  that  Ann  gives 
Bob the authorization to read File1 and gives him the privilege of granting this
authorization to others (in some systems, such capability of delegation is called
grant option [42]). Suppose then that Bob grants the authorization to Chris,
and subsequently Ann revokes the authorization from Bob. The question now is:
what  should  happen  to  the  authorization  that  Chris  has  received?  To  illustrate 
how  revocation  can  work  it  is  useful  to  look  at  the  history  of  System  R  [42].  In 
the  System  R  authorization  model,  users  creating  a  table  can  grant  other  users 
access privileges on it. Authorizations can be granted with the grant-option. If
a user receives the authorization for an access with the grant-option, she can
grant the access (and the grant option on it) to others. Intuitively, this
introduces a chain of authorizations. The original System R policy, which we call
(time-based) cascade revocation, adopted the following semantics for revocation:
when the grant option on an access is revoked from a user, all authorizations that
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the  organization.  Suppose  there  is  a  change  in  the  task  or  function  of  a  user  (say, 
because of a job promotion). This change may imply a change in the
responsibilities of the user and therefore in her privileges. New
authorizations will be granted to the user and some of her previous
authorizations will be revoked. Applying a recursive revocation will have the
undesirable effect of deleting all authorizations the revokee granted and,
recursively, all the authorizations granted through them, which will then need
to be re-issued. Moreover, all application programs depending on the revoked
authorizations will be invalidated. An alternative form of revocation was
proposed in [15], where non-cascade revocation is introduced. Instead of
deleting all the authorizations granted by the revokee by virtue of the
authorizations being revoked, non-cascade revocation re-specifies them to be
under the authority of the revoker, who can then retain or selectively delete
them. The original time-based revocation policy of System R was later changed
to no longer consider time. In SQL:1999 [28] revocation can be requested with
or without cascade. Cascade revocation recursively deletes authorizations if
the revokee no longer holds the grant option for the access. However, if the
revokee still holds the grant option for the access, the authorizations she
granted are not deleted (regardless of the time they were granted). For
instance, with reference to Figure 17(a), the revocation by Bob of the
authorization granted to David would only delete the authorization granted to
David by Bob. Ellen’s authorization would still remain valid since David still
holds the grant option on the access (because of the authorization from
Chris). With the non-cascade option the system rejects the revoke operation if
its enforcement would entail deletion of other authorizations besides the one
for which revocation is requested.
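
The SQL:1999-style behaviour just described can be sketched in a few lines of
Python. This is an illustrative sketch under simplifying assumptions, not the
algorithm of any particular system: every authorization is reduced to a
(grantor, grantee) pair that carries the grant option, the grant graph is a
hypothetical one chosen to match the example in the text (Figure 17(a) is not
reproduced here), and the names holders and revoke are introduced only for
this example.

```python
def holders(owner, grants):
    """Return the users who currently hold the grant option.

    A user holds it if she is the owner, or if she received the
    authorization from someone who holds it (reachability from the owner).
    """
    held = {owner}
    changed = True
    while changed:
        changed = False
        for grantor, grantee in grants:
            if grantor in held and grantee not in held:
                held.add(grantee)
                changed = True
    return held


def revoke(owner, grants, grantor, grantee, cascade=True):
    """Revoke one authorization and return the surviving authorizations."""
    if (grantor, grantee) not in grants:
        raise KeyError("no such authorization")
    remaining = set(grants) - {(grantor, grantee)}
    held = holders(owner, remaining)
    # An authorization survives only if its grantor still holds the grant option.
    surviving = {(g, r) for (g, r) in remaining if g in held}
    if surviving != remaining and not cascade:
        # Without cascade the request is rejected instead of deleting more.
        raise PermissionError("revocation would delete other authorizations")
    return surviving


# Hypothetical grant graph: Ann owns the object.
GRANTS = {("Ann", "Bob"), ("Ann", "Chris"), ("Bob", "David"),
          ("Chris", "David"), ("David", "Ellen")}

# Bob revokes from David: David is still supported by Chris, so Ellen keeps hers.
after_bob = revoke("Ann", GRANTS, "Bob", "David")

# Chris then revokes from David with cascade: Ellen's authorization goes too.
after_chris = revoke("Ann", after_bob, "Chris", "David")
```

Computing reachability from the owner, rather than merely counting the
incoming authorizations of the revokee, is a design choice of this sketch: it
avoids keeping alive a cycle of users who only support one another.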


6.5  Integrity  policies 

In Section 4.4 we illustrated a mandatory policy (namely Biba’s model) for
protecting information integrity. Biba’s approach, however, suffers from two
major drawbacks: i) the constraints imposed on the information flow may prove
too restrictive, and ii) it only controls integrity intended as the prevention
of a flow of information from low integrity objects to high integrity objects.
However, this notion of one-directional information flow in a lattice captures
only a small part of the data integrity problem [74].

Integrity  is  concerned  with  ensuring  that  no  resource  (including  data  and 
programs9)  has  been  modified  in  an  unauthorized  or  improper  way  and  that  the 
data  stored  in  the  system  correctly  reflect  the  real  world  they  are  intended  to 
represent  (i.e.,  that  users  expect).  Integrity  preservation  requires  prevention  of 
frauds  and  errors,  as  the  term  “improper”  used  above  suggests:  violations  to 
data  integrity  are  often  enacted  by  legitimate  users  executing  authorized  actions 
but  misusing  their  privileges. 

Any data management system today has functionalities for ensuring
integrity [8]. Basic integrity services are, for example, concurrency control (to

9 Programs improperly modified can fool the access control and bypass the system
restrictions, thus violating the secrecy and/or integrity of the data (see Section 3).

C1: All IVPs must ensure that all CDIs are in a valid state when the IVP is run
C2: All TPs must be certified to be valid (i.e., preserve validity of CDIs’ state)
C3: Assignment of TPs to users must satisfy separation of duty
C4: The operations of TPs must be logged
C5: TPs executing on UDIs must result in valid CDIs
E1: Only certified TPs can manipulate CDIs
E2: Users must only access CDIs by means of TPs for which they are authorized
E3: The identity of each user attempting to execute a TP must be authenticated
E4: Only the agent permitted to certify entities can change the list of such
    entities associated with other entities

Fig. 18. Clark and Wilson integrity rules


- Constrained Data Items. CDIs are the objects whose integrity must be
  safeguarded.

- Unconstrained Data Items. UDIs are objects that are not covered by the
  integrity policy (e.g., information typed by the user on the keyboard).

- Integrity Verification Procedures. IVPs are procedures meant to verify that
  CDIs are in a valid state, that is, the IVPs confirm that the data conforms
  to the integrity specifications at the time the verification is performed.

- Transformation Procedures. TPs are the only procedures (well-formed
  procedures) that are allowed to modify CDIs or to take arbitrary user input
  and create new CDIs. TPs are designed to take the system from one valid
  state to the next.

Intuitively, IVPs and TPs are the means for enforcing the well-formed
transaction requirement: all data modifications must be carried out through
TPs, and the result must satisfy the conditions imposed by the IVPs.

Separation of duty must be taken care of in the definition of authorized
operations. In the context of Clark and Wilson’s model, authorized operations
are specified by assigning to each user a set of well-formed transactions that
she can execute (which have access to constrained data items). Separation of
duty requires the assignment to be defined in a way that makes it impossible
for a user to violate the integrity of the system. Intuitively, separation of
duty is enforced by splitting operations into subparts, each to be executed by
a different person (to make fraud difficult). For instance, any person
permitted to create or certify a well-formed transaction should not be able to
execute it (against production data).

Figure  18  summarizes  the  nine  rules  that  Clark  and  Wilson  presented  for 
the  enforcement  of  system  integrity.  The  rules  are  partitioned  into  two  types: 
certification  (C)  and  enforcement  (E).  Certification  rules  involve  the  evaluation 
of  transactions  by  an  administrator,  whereas  enforcement  is  performed  by  the 
system. 
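
The enforcement side of Figure 18 can be illustrated with a small reference
monitor. The sketch below is only one possible reading of rules E1, E2, and E3
together with the logging requirement C4; the class ClarkWilsonMonitor, its
method names, and the sample users, TPs, and CDIs are hypothetical and are not
part of the Clark and Wilson proposal itself.

```python
class ClarkWilsonMonitor:
    """Toy reference monitor for some of the Clark and Wilson rules."""

    def __init__(self):
        self.certified = {}   # TP name -> set of CDIs it is certified for (E1)
        self.allowed = set()  # (user, TP name, CDI) triples (E2)
        self.log = []         # record of executed TPs (C4)

    def certify(self, tp, cdis):
        # Certification itself (C2) is done by an administrator, outside this sketch.
        self.certified[tp] = set(cdis)

    def authorize(self, user, tp, cdi):
        self.allowed.add((user, tp, cdi))

    def run(self, user, authenticated, tp, cdi):
        if not authenticated:                               # E3
            raise PermissionError("user is not authenticated")
        if cdi not in self.certified.get(tp, set()):        # E1
            raise PermissionError("TP is not certified for this CDI")
        if (user, tp, cdi) not in self.allowed:             # E2
            raise PermissionError("user is not authorized for this TP and CDI")
        self.log.append((user, tp, cdi))                    # C4
        return True


monitor = ClarkWilsonMonitor()
monitor.certify("post_payment", {"ledger"})
monitor.authorize("alice", "post_payment", "ledger")
ok = monitor.run("alice", True, "post_payment", "ledger")
```

Separation of duty (C3) would constrain which triples may be passed to
authorize; it is deliberately left out here to keep the sketch short.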

Clark and Wilson’s proposal outlines good principles for controlling
integrity. However, it has limitations due to the fact that it is far from
formal and it is unclear how to formalize it in a general setting.

Fig. 19. An example of NPD privilege graph [9]


Although different proposals have been made (e.g., [3,36,45,56,67,76,80]), the
basic concepts are common to all approaches. Essentially, role-based policies
require the identification of roles in the system, where a role can be defined
as a set of actions and responsibilities associated with a particular working
activity. The role can be widely scoped, reflecting a user’s job title (e.g.,
secretary), or it can be more specific, reflecting, for example, a task that
the user needs to perform (e.g., order processing). Then, instead of specifying
all the accesses each user is allowed to execute, access authorizations on
objects are specified for roles. Users are then given authorizations to adopt
roles (see Figure 20). The user playing a role is allowed to execute all
accesses for which the role is authorized. In general, a user can take on
different roles on different occasions. Also, the same role can be played by
several users, perhaps simultaneously. Some proposals for role-based access
control (e.g., [76,80]) allow a user to exercise multiple roles at the same
time. Other proposals (e.g., [28,48]) limit the user to only one role at a
time, or recognize that some roles can be jointly exercised while others must
be adopted in exclusion to one another. It is important to note the difference
between groups (see Section 6) and roles: groups define sets of users while
roles define sets of privileges. There is a semantic difference between them:
roles can be “activated” and “deactivated” by users at their discretion, while
group membership always applies, that is, users cannot enable and disable group
memberships (and corresponding authorizations) at will. However, since roles
can be defined which correspond to organizational figures (e.g., secretary,
chair, and faculty), the same “concept” can be seen both as a group and as a
role.
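
The two-step indirection described above (authorizations are attached to
roles, and users activate roles) can be made concrete with a short Python
sketch. It is a minimal illustration, not the data model of any of the cited
proposals; the class RoleBasedPolicy, its methods, and the sample user and
role are assumptions made for the example.

```python
class RoleBasedPolicy:
    """Minimal role-based policy: permissions attach to roles, not to users."""

    def __init__(self):
        self.role_permissions = {}  # role -> set of (action, object)
        self.user_roles = {}        # user -> roles the user may adopt
        self.active_roles = {}      # user -> roles currently activated

    def permit(self, role, action, obj):
        self.role_permissions.setdefault(role, set()).add((action, obj))

    def assign(self, user, role):
        self.user_roles.setdefault(user, set()).add(role)

    def activate(self, user, role):
        if role not in self.user_roles.get(user, set()):
            raise PermissionError(f"{user} may not adopt role {role}")
        self.active_roles.setdefault(user, set()).add(role)

    def deactivate(self, user, role):
        self.active_roles.get(user, set()).discard(role)

    def check(self, user, action, obj):
        # Only activated roles count, which is what distinguishes roles from groups.
        return any(
            (action, obj) in self.role_permissions.get(role, set())
            for role in self.active_roles.get(user, set())
        )


policy = RoleBasedPolicy()
policy.permit("secretary", "write", "letter")
policy.assign("carol", "secretary")
before = policy.check("carol", "write", "letter")   # role assigned but not active
policy.activate("carol", "secretary")
during = policy.check("carol", "write", "letter")
policy.deactivate("carol", "secretary")
after = policy.check("carol", "write", "letter")
```

This version lets a user hold several active roles at once; a proposal that
limits the user to one role at a time would replace the set in active_roles
with a single value.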

The  role-based  approach  has  several  advantages.  Some  of  these  are  discussed 
below. 

Authorization management Role-based policies benefit from a logical
independence in specifying user authorizations by breaking this task into two
parts: i) assignment of roles to users, and ii) assignment of authorizations

Separation of duties Separation of duties refers to the principle that no user
should be given enough privileges to misuse the system on their own. For
instance, the person authorizing a paycheck should not be the same person who
can prepare it. Separation of duties can be enforced either statically (by
defining conflicting roles, that is, roles which cannot be executed by the same
user) or dynamically (by enforcing the control at access time). An example
of a dynamic separation of duty restriction is the two-person rule. The first
user to execute a two-person operation can be any authorized user, whereas
the second user can be any authorized user different from the first [79].
Constraints enforcement Roles provide a basis for the specification and
enforcement of further protection requirements that real world policies may
need to express. For instance, cardinality constraints can be specified, that
restrict the number of users allowed to activate a role or the number of roles
allowed to exercise a given privilege. The constraints can also be dynamic,
that is, be imposed on role activation rather than on role assignment. For
instance, while several users may be allowed to activate role chair, a further
constraint can require that at most one user at a time can activate it.
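
The constraints listed above can be sketched as follows. The code is a hedged
illustration only: static separation of duties is modelled as pairs of
conflicting roles checked at assignment time, the cardinality constraint is
checked at activation time, and the two-person rule is a simple predicate. All
names (ConstrainedRoles, two_person_rule, the payroll roles) are hypothetical.

```python
class ConstraintError(Exception):
    """Raised when a role assignment or activation violates a constraint."""


class ConstrainedRoles:
    def __init__(self, conflicts=(), max_active=None):
        self.conflicts = [frozenset(pair) for pair in conflicts]  # static SoD
        self.max_active = dict(max_active or {})  # role -> max simultaneous users
        self.assigned = {}                        # user -> set of roles
        self.active = {}                          # role -> set of users

    def assign(self, user, role):
        roles = self.assigned.setdefault(user, set())
        for pair in self.conflicts:
            # Static check: the user must not already hold the conflicting role.
            if role in pair and (pair - {role}) & roles:
                raise ConstraintError(f"{role} conflicts with a role of {user}")
        roles.add(role)

    def activate(self, user, role):
        if role not in self.assigned.get(user, set()):
            raise ConstraintError(f"{role} is not assigned to {user}")
        users = self.active.setdefault(role, set())
        limit = self.max_active.get(role)
        # Dynamic check: the limit applies to activation, not to assignment.
        if limit is not None and user not in users and len(users) >= limit:
            raise ConstraintError(f"too many users have activated {role}")
        users.add(user)


def two_person_rule(first_user, second_user, authorized):
    """Both users must be authorized, and they must be different people."""
    return (first_user in authorized and second_user in authorized
            and first_user != second_user)


roles = ConstrainedRoles(
    conflicts=[("payroll_preparer", "payroll_approver")],
    max_active={"chair": 1},
)
roles.assign("alice", "payroll_preparer")
roles.assign("bob", "chair")
roles.assign("carol", "chair")      # several users may be assigned the role
roles.activate("bob", "chair")      # but only one may activate it at a time
```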

Role-based policies represent a promising direction and a useful paradigm
for many commercial and government organizations. However, there is still some
work to be done to cover all the different requirements that real world
scenarios may present. For instance, the simple hierarchical relationship as
intended in current proposals may not be sufficient to model the different
kinds of relationships that can occur. For example, a secretary may need to be
allowed to write specific documents on behalf of her manager, but neither role
is a specialization of the other. Different ways of propagating privileges
(delegation) should then be supported. Similarly, administrative policies
should be enriched. For instance, the traditional concept of ownership may no
longer apply: a user does not necessarily own the objects she created when in
a given role. Also, users’ identities should not be forgotten. If it is true
that in most organizations the role (and not the identity) identifies the
privileges that one may execute, it is also true that in some cases the
requestor’s identity needs to be considered even when a role-based policy is
adopted. For instance, a doctor may be allowed to specify treatments and
access files but she may be restricted to treatments and files

flexibility and extensibility in access specifications and illustrates how
these advantages can be achieved by abstracting from the low level
authorization triples and adopting a high level authorization language. Their
language is essentially a many-sorted first-order language with a rule
construct, useful to express authorization derivations and therefore model
authorization implications and default decisions (e.g., closed or open
policy). The use of a very general language, which has almost the same
expressive power as first-order logic, allows the expression of different
kinds of authorization implications, constraints on authorizations, and access
control policies. However, as a drawback, authorization specifications may be
difficult to understand and manage. Also, the trade-off between expressiveness
and efficiency seems to be strongly unbalanced: the lack of restrictions on
the language results in the specification of models which may not even be
decidable and therefore will not be implementable. As noted in [48], Woo and
Lam’s approach is based on truth in extensions of arbitrary default theories,
which is known, even in the propositional case, to be NP-complete, and in the
first-order case is worse than undecidable.

Starting from these observations, Jajodia et al. [48] worked on a proposal for
a logic-based language that attempted to balance flexibility and expressiveness
on the one side, and easy management and performance on the other. The
language allows the representation of different policies and protection
requirements, while at the same time providing understandable specifications,
clear semantics (guaranteeing therefore the behavior of the specifications),
and bearable data complexity. Their proposal for a Flexible Authorization
Framework (FAF) identifies a polynomial time (in fact quadratic time) data
complexity fragment of default logic, and is therefore effectively
implementable. The language identifies the following predicates for the
specification of authorizations. (Below s, o, and a denote a subject, object,
and action term, respectively, where a term is either a constant value in the
corresponding domain or a variable ranging over it.)


cando(o,s,⟨sign⟩a) represents authorizations explicitly inserted by the
security administrator. They represent the accesses that the administrator
wishes to allow or deny (depending on the sign associated with the action).
dercando(o,s,⟨sign⟩a) represents authorizations derived by the system using
logical rules of inference (modus ponens plus rules for stratified negation).
Logical rules can express hierarchy-based authorization derivation (e.g.,
propagation of authorizations from groups to their members) as well as
different implication relationships that may need to be represented.
do(o,s,⟨sign⟩a) definitely represents the accesses that must be granted or
denied. Intuitively, do enforces the conflict resolution and access decision
policies, that is, it decides whether to grant or deny the access, possibly
solving existing conflicts and enforcing default decisions (in the case where
no authorization has been specified for an access).
done(o,s,r,a,t) keeps the history of the accesses executed. A fact of the form
done(o,s,r,a,t) indicates that s operating in role r executed action a on
object o at time t.
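
The roles of cando, dercando, and do can be illustrated with a small Python
sketch. It is not the FAF language itself and does not implement its rule
syntax: here cando is a plain set of facts, dercando is computed by one
hard-wired derivation rule (propagation from groups to members), and do
applies one particular conflict resolution policy (denials take precedence)
with a closed default. These policy choices, the function names, and the
sample data are assumptions made for the example.

```python
def dercando(cando, member_of):
    """Derive authorizations by propagating them from groups to members.

    cando     -- set of (object, subject, sign, action) facts
    member_of -- dict mapping a subject to the groups it directly belongs to
    """
    derived = set(cando)
    changed = True
    while changed:  # iterate to a fixpoint so that nested groups are handled
        changed = False
        for (obj, subj, sign, action) in list(derived):
            for member, groups in member_of.items():
                fact = (obj, member, sign, action)
                if subj in groups and fact not in derived:
                    derived.add(fact)
                    changed = True
    return derived


def do(obj, subj, action, derived):
    """Final decision: '+' to grant, '-' to deny."""
    if (obj, subj, "-", action) in derived:
        return "-"  # conflict resolution: denials take precedence
    if (obj, subj, "+", action) in derived:
        return "+"
    return "-"      # default decision: closed policy


CANDO = {("report", "Staff", "+", "read"), ("report", "Bob", "-", "read")}
MEMBER_OF = {"Ann": {"Staff"}, "Bob": {"Staff"}}
DERIVED = dercando(CANDO, MEMBER_OF)

ann = do("report", "Ann", "read", DERIVED)      # derived from group Staff
bob = do("report", "Bob", "read", DERIVED)      # explicit denial wins
chris = do("report", "Chris", "read", DERIVED)  # nothing specified
```

Swapping the body of do for a different rule (for example, permissions take
precedence, or an open default) changes the policy without touching the
explicit authorizations, which is the flexibility the predicates are meant to
provide.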
Stratum  Predicate       Rules defining predicate

0        hie-predicates  base relations.
         rel-predicates  base relations.
         done            base relation.

1        cando           body may contain done, hie-
                         and rel-literals.

2        dercando        body may contain cando, dercando, done,
                         hie-, and rel-literals. Occurrences of
                         dercando literals must be positive.

3        do              in the case when head is of the form
                         do(_, _, +a) body may contain cando,
                         dercando, done, hie- and rel-literals.

4        do              in the case when head is of the form
                         do(o, s, −a) body contains just one literal
                         ¬do(o, s, +a).

5        error           body may contain do, cando, dercando, done,
                         hie-, and rel-literals.

Fig. 22. Rule composition and stratification of the proposal in [48]


8.2  Composition  of  access  control  policies 

In many real world situations, access control needs to combine restrictions
independently stated that should be enforced as one, while retaining their
independence and administrative autonomy. For instance, the global policy of a
large organization can be the combination of the policies of its different
departments and divisions as well as of externally imposed constraints (e.g.,
privacy regulations); each of these policies should be taken into account while
remaining independent and autonomously managed. Another example is represented
by the emerging dynamic coalition scenarios where different parties, coming
together for a common goal for a limited time, need to merge their security
requirements in a controlled way while retaining their autonomy. Since existing
frameworks assume a single monolithic specification of the entire access
control policy, the situations above would require translating and merging the
different component policies into a single “program” in the adopted access
control language. While existing languages are flexible enough to obtain the
desired combined behavior, this method has several drawbacks. First, the
translation process is far from trivial; it must be done very carefully to
avoid undesirable side effects due to interference between the component
policies. Interference may result in the combined specifications not correctly
reflecting the intended restrictions. Second, after translation it is no longer
possible to operate on the individual components and maintain them
autonomously. Third, existing approaches cannot take into account incomplete
policies, where some components are not (completely) known a priori (e.g., when
somebody else is to provide that component). Starting from these observations,
Bonatti et al. [20] argue for the need for a policy composition framework by
which different component policies can be

different component, provides a convenient way for reasoning about policies at
different levels of abstraction. Also, it allows for the support of
heterogeneous policies and policies that are unknown a priori and can only be
queried at access control time.
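
One way to picture composition without merging is to treat every component
policy as a black box that answers access requests, and to build the global
policy from combinators over such boxes. The sketch below follows that idea in
Python. It is an illustration under assumptions, not the framework of [20]:
the combinator names (addition, conjunction, subtraction) and the sample
policies are chosen for this example and are not taken from the text above.

```python
def from_triples(triples):
    """Wrap an explicit set of (subject, object, action) triples as a policy."""
    allowed = frozenset(triples)
    return lambda s, o, a: (s, o, a) in allowed


def addition(p, q):
    """Grant what either component grants."""
    return lambda s, o, a: p(s, o, a) or q(s, o, a)


def conjunction(p, q):
    """Grant only what both components grant."""
    return lambda s, o, a: p(s, o, a) and q(s, o, a)


def subtraction(p, q):
    """Grant what p grants, except what q covers."""
    return lambda s, o, a: p(s, o, a) and not q(s, o, a)


# Two departmental policies, each maintained on its own.
ward = from_triples({("alice", "record", "read"), ("bob", "record", "read")})
lab = from_triples({("carol", "sample", "write")})


def privacy_regulation(s, o, a):
    # An externally imposed component that is only queried at access time.
    return not (s == "bob" and o == "record")


global_policy = conjunction(addition(ward, lab), privacy_regulation)
alice_ok = global_policy("alice", "record", "read")
bob_ok = global_policy("bob", "record", "read")
carol_ok = global_policy("carol", "sample", "write")
```

Because the components stay separate callables, ward or lab can be replaced
or updated by its own administrator without re-deriving the global policy.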

8.3  Certificate-based  access  control 

Today’s Globally Internetworked Infrastructure connects remote parties through
the use of large scale networks, such as the World Wide Web. Execution of
activities at various levels is based on the use of remote resources and
services, and on the interaction between different, remotely located, parties
that may know little about each other. In such a scenario, traditional
assumptions for establishing and enforcing access control regulations no
longer hold. For instance, a server may receive requests not just from the
local community of users, but also from remote, previously unknown users. The
server may not be able to authenticate these users or to specify
authorizations for them (with respect to their identity). The traditional
separation between authentication and access control cannot be applied in this
context, and alternative access control solutions should be devised. A
possible solution to this problem is the use of digital certificates (or
credentials), representing statements certified by given entities (e.g.,
certification authorities), which can be used to establish properties of their
holder (such as identity, accreditation, or authorizations) [39].
Trust-management systems (e.g., PolicyMaker [18], KeyNote [17], REFEREE [24],
and DL [57]) use credentials to describe specific delegation of trust among
keys and to bind public keys to authorizations. They therefore depart from the
traditional separation between authentication and authorizations by granting
authorizations directly to keys (bypassing identities). Trust management
systems provide an interesting framework for reasoning about trust between
unknown parties; however, assigning authorizations to keys may be limiting and
make authorization specifications difficult to manage. A promising direction
exploiting digital certificates to regulate access control is represented by
new authorization models that make the decision of whether or not a party may
execute an access dependent on properties that the party may have, and can
prove by presenting one or more certificates (authorization certificates in
[18] being a specific kind of them). Besides a more complex authorization
language and model, there is however a further complication arising in this
new scenario, due to the fact that the access control paradigm is changing. On
the one side, the server may not have all the information it needs in order to
decide whether or not an access should be granted (and exploits certificates
to take the decision). On the other side, however, the requestor may not know
which certificates she needs to present to a (possibly just encountered)
server in order to get access. Therefore, the server itself should, upon
reception of the request, return to the user the information of what she
should do (if possible) to get access. In other words, the system can no
longer simply return a “yes/no” access decision. Rather, it should return the
requisites that it requires be satisfied for the access to be allowed. The
certificates mentioned above are

in [88, 93] investigating trust negotiation issues and strategies that a party
can apply to select credentials to submit to the opponent party in a
negotiation. In particular, [88] distinguishes between eager and parsimonious
credential release strategies. Parties applying the first strategy turn over
all their credentials if the release policy for them is satisfied, without
waiting for the credentials to be requested. Parsimonious parties only release
credentials upon explicit request by the server (avoiding unnecessary
releases). Yu et al. [93] present a prudent negotiation strategy aimed at
establishing trust among parties while avoiding disclosure of irrelevant
credentials.
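
The difference between the eager and the parsimonious strategy can be shown
with a short sketch. It is a simplified illustration and not the protocol of
[88]: a release policy is modelled as the set of credentials that the other
party must already have shown, and the function names and sample credentials
are hypothetical.

```python
def unlocked(credentials, received):
    """Credentials whose release policy is satisfied.

    credentials -- dict: credential name -> set of credentials required
                   from the other party before it may be released
    received    -- set of credentials the other party has shown so far
    """
    return {name for name, required in credentials.items()
            if required <= received}


def eager_release(credentials, received):
    """Eager: turn over every unlocked credential without being asked."""
    return unlocked(credentials, received)


def parsimonious_release(credentials, received, requested):
    """Parsimonious: release an unlocked credential only if it was requested."""
    return unlocked(credentials, received) & set(requested)


CLIENT = {
    "student_id": set(),                  # freely releasable
    "library_card": set(),                # freely releasable
    "credit_card": {"merchant_license"},  # only after the server proves itself
}

eager_first = eager_release(CLIENT, received=set())
thrifty_first = parsimonious_release(
    CLIENT, received=set(), requested={"student_id", "credit_card"})
thrifty_later = parsimonious_release(
    CLIENT, received={"merchant_license"},
    requested={"student_id", "credit_card"})
```

In this example the eager client discloses library_card although nobody asked
for it, whereas the parsimonious client never does.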

A credential-based access control is also presented by Bonatti and Samarati
in [21]. They propose a uniform framework for regulating service access and
information disclosure in an open, distributed network system like the Web. As
in previous proposals, access regulations are specified as logical rules,
where some predicates are explicitly identified. Besides credentials, the
proposal also allows reasoning about declarations (i.e., unsigned statements)
and user profiles that the server can maintain and exploit for taking the
access decision. Communication of requisites to be satisfied by the requestor
is based on a filtering and renaming process applied to the server’s policy,
which exploits partial evaluation techniques in logic programs. The filtering
process allows the server to communicate to the client the requisites for an
access, without disclosing possibly sensitive information on which the access
decision is taken. The proposal also allows clients to control the release of
their credentials, possibly making counter-requests to the server, and
releasing certain credentials only if their counter-requests are satisfied
(see Figure 23). Client-server interplay is limited to two interactions to
allow clients to apply a parsimonious strategy (i.e., minimizing the set of
information and credentials released) when deciding which set of
credentials/declarations to release among the possible alternative choices
they may have.

While all these approaches assume access control rules to be expressed in
logic form, often the people specifying the security policies are unfamiliar
with logic-based languages. An interesting aspect to be investigated concerns
the definition of a language for expressing and exchanging policies based on a
high level formulation that, while powerful, is easily interchangeable and
both human and machine readable. Insights in this respect can be taken from
recent proposals expressing access control policies as XML documents [26,27].

All the proposals above open interesting new directions in the access control
area.
In October of 1999, the Infosec Research Council created a Science
and Technology Study Group focused on malicious code. The Malicious
Code ISTSG is charged with developing a national research
agenda to address the accelerating threat from malicious code. The
study is intended to identify promising new approaches to dealing with the
problems posed by malicious code. In this report, we discuss important
trends that are making malicious code an increasingly serious problem. We then
survey existing techniques for preventing attacks, pointing out their
limitations, and discuss some promising new approaches that might address
these limitations.

This report is a byproduct of two meetings of Study Group members and their
invited guests. Although this report was written by two of the study group
members, we believe it represents an accurate distillation of the ideas and
insights of all the participants.

What Is Malicious Code?

Malicious code is any code added, changed, or removed from a software system
to intentionally cause harm or subvert the system’s intended function.
Although the problem of malicious code has a long history, a number of recent,
widely publicized attacks and certain economic trends suggest that malicious
code is rapidly becoming a critical problem for industry, government, and
individuals.

Traditional examples of malicious code include viruses, worms, Trojan Horses,
and attack scripts, while more modern examples include Java attack applets and
dangerous ActiveX controls:

■ Viruses are pieces of malicious code that attach to host programs and
propagate when an infected program executes.

■ Worms are particular to networked computers. Instead of attaching
themselves to a host program, worms carry out programmed attacks to jump from
machine to machine across the network.

■ Trojan Horses, like viruses, hide malicious intent inside a host program
that appears to do something useful (such as a program that captures
passwords by masquerading as the login daemon).

■ Attack scripts are programs written by experts that exploit security
weaknesses, usually across the network, to carry out an attack. Attack scripts
exploiting buffer overflows by “smashing the stack” are the most commonly
encountered variety.

■ Java attack applets are programs embedded in Web pages that achieve
foothold through a Web browser.

■ Dangerous ActiveX controls are program components that allow a malicious
code fragment to control applications or the operating system.

Recently, the distinctions between malicious code categories have been
bleeding together, making classification difficult. Table 1 provides some
concrete examples of malicious code. Recent versions of malicious code are
really amalgamations of different categories.

A  Growing  Problem 

Complex devices, by their very nature, introduce the risk that malicious
functionality can be added (either during creation or afterwards) that extends
the original device past its primary intended design. As an unfortunate side
effect, inherent complexity lets malicious subsystems remain invisible to
unsuspecting users until it is too late. Some of the earliest malicious
functionality, for example, was associated with complicated copy machines.
Extensible systems, including computers, are particularly susceptible to the
malicious functionality problem. When extending a system is as easy as writing
and installing a program, the risk of intentional introduction of malicious
behavior increases drastically.

Any computing system is susceptible to malicious code. Rogue programmers can
modify systems software that is initially installed on the machine. Users
might unwittingly propagate a virus by installing new programs or software
updates from a CD-ROM. In a multi-user system, a hostile user might install a
Trojan Horse to collect other users’ passwords. These attack vectors have been
well known since the dawn of computing, so why is malicious code a bigger
problem now than in the past? We argue that a small number of trends have a
large influence on the recent widespread propagation of malicious code.

Networks  Are  Everywhere 

The  growing  connectivity  of  computers 
through  the  Internet  has  increased  both  the 
number  of  attack  vectors  and  the  ease  with 
which  an  attack  can  be  made.  More  and 
more  computers,  ranging  from  home  PCs  to 
systems  that  control  critical  infrastructures 
(such  as  the  power  grid),  are  being  con¬ 
nected  to  the  Internet.  Furthermore,  people, 
businesses,  and  governments  are  increasingly 
dependent  upon  network-enabled  communi¬ 
cation  such  as  e-mail  or  Web  pages  provided 
by  information  systems.  Unfortunately,  as 
these  systems  are  connected  to  the  Internet, 
they  become  vulnerable  to  attacks  from  dis¬ 
tant  sources.  Put  simply,  an  attacker  no 
longer  needs  physical  access  to  a  system  to 
install  or  propagate  malicious  code. 

Because  access  through  a  network  does 
not  require  human  intervention,  launching 
automated  attacks  from  the  comfort  of  your 
living  room  is  relatively  easy.  Indeed,  the  re¬ 
cent  denial-of-service  attacks  in  February  of 
2000  took  advantage  of  a  number  of  (previ¬ 
ously  compromised)  hosts  to  flood  popular 
e-commerce  Web  sites  with  bogus  requests 
automatically.  The  ubiquity  of  networking 
means  that  there  are  more  systems  to  attack, 
more  attacks,  and  greater  risks  from  mali¬ 
cious  code  than  in  the  past. 

System  Complexity  Is  Rising 

A  second  trend  that  has  enabled  wide¬ 
spread  propagation  of  malicious  code  is  the 
size  and  complexity  of  modern  information 
systems.  A  desktop  system  running  Win¬ 
dows/NT  and  associated  applications  de¬ 
pends  upon  the  proper  functioning  of  the 
kernel  as  well  as  the  applications  to  ensure 
that  malicious  code  cannot  corrupt  the  sys¬ 
tem.  However,  NT  itself  consists  of  tens  of 
millions  of  lines  of  code,  and  applications 
are  becoming  equally,  if  not  more,  complex. 
When  systems  become  this  large,  bugs  can¬ 
not  be  avoided.  Exacerbating  this  problem 
is the use of unsafe programming languages
(such as C or C++) that do not protect
against simple kinds of attacks, such as
buffer overflows. However, even if the systems
and applications code were bug free,
improper  configuration  by  retailers,  admin¬ 
istrators,  or  users  can  open  the  door  to  ma¬ 
licious  code.  In  addition  to  providing  more 
avenues  for  attack,  complex  systems  make  it 
easier  to  hide  or  mask  malicious  code.  In 
theory,  we  could  analyze  and  prove  that  a 
small  program  was  free  of  malicious  code, 
but  this  task  is  impossible  for  even  the  sim¬ 
plest  desktop  systems  today,  much  less  the 
enterprise-wide  systems  used  by  businesses 
or  governments. 

Systems  Are  Easily  Extensible 

A  third  trend  enabling  malicious  code  is 
the  degree  to  which  systems  have  become 
extensible.  An  extensible  host  accepts  up¬ 
dates  or  extensions,  sometimes  referred  to 
as  mobile  code ,  so  that  the  system’s  func¬ 
tionality  can  be  evolved  in  an  incremental 
fashion.  For  example,  the  plug-in  architec¬ 
ture  of  Web  browsers  makes  it  easy  to  in¬ 
stall  viewer  extensions  for  new  document 
types  as  needed.  Today’s  operating  systems 
support  extensibility  through  dynamically 
loadable  device  drivers  and  modules.  To¬ 
day’s  applications,  such  as  word  processors, 
e-mail  clients,  spreadsheets,  and  Web 
browsers, support extensibility through
scripting,  controls,  components,  and  ap¬ 
plets.  From  an  economic  standpoint,  exten¬ 
sible  systems  are  attractive  because  they 
provide  flexible  interfaces  that  can  be 
adapted  through  new  components.  In  to¬ 
day’s  marketplace,  it  is  crucial  that  software 
be  deployed  as  rapidly  as  possible  to  gain 
market  share.  Yet  the  marketplace  also  de¬ 
mands  that  applications  provide  new  fea¬ 
tures  with  each  release.  An  extensible  archi¬ 
tecture  makes  it  easy  to  satisfy  both  de¬ 
mands  by  letting  companies  ship  the  base 
application  code  early  and  later  ship  feature 
extensions  as  needed. 

Unfortunately,  the  very  nature  of  extensi¬ 
ble  systems  makes  it  hard  to  prevent  mali¬ 
cious  code  from  slipping  in  as  an  unwanted 
extension.  For  example,  the  Melissa  virus 
took  advantage  of  the  scripting  extensions  of 
Microsoft’s  Outlook  e-mail  client  to  propa¬ 
gate  itself.  The  virus  was  coded  as  a  script 
contained  in  what  appeared  to  users  as  an 
innocuous  mail  message.  When  users  opened 
the  message,  the  script  executed,  proceeded 
to  obtain  email  addresses  from  the  user’s 
contacts  database,  and  then  sent  copies  of  it¬ 
self  to  those  addresses.  The  infamous  Love 
Bug  worked  very  similarly,  also  taking  ad¬ 
vantage  of  Outlook’s  scripting  capabilities. 

Defense  against  Malicious  Code 

Creating malicious code is not hard. In
fact, it is as simple as writing a program or
downloading  and  configuring  a  set  of  easily 
customized  components.  It  is  becoming  in¬ 
creasingly  easy  to  hide  ill-intentioned  code 
inside  otherwise  innocuous  objects,  including 
Web  pages  and  e-mail  messages.  This  makes 
detecting  and  stopping  malicious  code  before 
it  can  do  any  damage  extremely  hard. 

To  make  matters  worse,  our  traditional 
tools  for  ensuring  the  security  and  integrity 
of  hosts  have  not  kept  pace  with  the  ever- 
changing  suite  of  applications.  For  example, 
traditional  security  mechanisms  for  access 
control  reside  within  an  operating  system 
kernel  and  protect  relatively  primitive  ob¬ 
jects  (such  as  files);  but  increasingly,  attacks 
such  as  the  Melissa  virus  happen  at  the  ap¬ 
plication  level  where  the  kernel  has  no  op¬ 
portunity  to  intervene. 

A  useful  analogy  is  to  think  of  today’s 
computer  and  network  security  mechanisms 
like  the  walls,  moats,  and  drawbridges  of 
medieval  times.  At  one  point,  these  mecha¬ 
nisms  were  effective  for  defending  our  com¬ 
puting  castles  against  isolated  attacks, 
mounted  on  horseback.  But  the  defenses 
have  not  kept  pace  with  the  attacks.  Today, 
attackers  have  access  to  airplanes  and  laser- 
guided  bombs  that  can  easily  bypass  our  an¬ 
tiquated  defenses.  In  fact,  attackers  rarely 
need  sophisticated  equipment:  because  our 
kingdoms  are  really  composed  of  hundreds 
of  interconnected  castles,  attackers  can  eas¬ 
ily  move  from  site  to  site,  finding  places 
where  we  have  left  the  drawbridge  down.  It 
is  time  to  develop  some  new  defenses. 

In  general,  when  a  computational  agent 
arrives  at  a  host,  there  are  four  approaches 
that  the  host  can  take  to  protect  itself: 

■  Analyze  the  code  and  reject  it  if  there  is 
the  potential  that  executing  it  will  cause 
harm. 

■  Rewrite  the  code  before  executing  it  so 
that  it  can  do  no  harm. 

■  Monitor the code while it is executing
and stop it before it does harm, or

■  Audit the code during execution and
take policing action if it has done harm.

Code  analysis  includes  simple  techniques, 
such as scanning a file and rejecting it if it
contains any known virus, as well as more
sophisticated techniques, including dataflow
analysis, which can sometimes discover previously
unseen malicious code. Analysis can
also  help  locate  security-related  bugs  (such  as 
potential  buffer  overflow  conditions)  that 
malicious  code  can  use  to  gain  a  foothold  in 
a  system.  But  analyses  are  necessarily  limited, 
because  determining  if  code  will  misbehave  is 
as  hard  as  the  halting  problem.  Conse¬ 
quently,  any  analysis  will  either  be  too  con¬ 
servative  (and  reject  some  perfectly  good 
code)  or  too  permissive  (and  let  some  bad 
code in) or, more likely, both. Furthermore,
software  engineers  working  on  their  own  sys¬ 
tems  often  neglect  to  apply  any  bug-finding 
analyses. Automated tools such as the open
source security scanner ITS4 (see
www.rstcorp.com/its4) and more sophisticated tools
incorporating dataflow analysis can be effective
for finding bugs.1,2 In addition, primitive
static  analysis,  such  as  looking  for  particular 
patterns  of  system  calls  in  an  executable,  has 
been  incorporated  into  some  commercially 
available  security  products. 
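
The simplest kind of code analysis described above can be sketched in a few lines. The following is a minimal illustration in Python of a lexical scan over C source text, loosely in the spirit of tools like ITS4 but not a reproduction of any real tool's rules. The names `RISKY_CALLS`, `scan_c_source`, and `SAMPLE` are assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny rule set: C library calls that are
# frequent sources of buffer overflows, with a short reason for each.
RISKY_CALLS = {
    "gets": "reads input with no length limit",
    "strcpy": "copies without checking the destination size",
    "strcat": "appends without checking the destination size",
    "sprintf": "formats into a buffer of unchecked size",
}

# Match a risky name used as a call, e.g. `strcpy (`, but not `my_strcpy(`.
CALL_PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def scan_c_source(source):
    """Return (line number, function name, reason) for each suspicious call."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            name = match.group(1)
            findings.append((line_number, name, RISKY_CALLS[name]))
    return findings


SAMPLE = """\
char buf[16];
gets(buf);
strncpy(dst, src, sizeof dst);
/* strcpy(a, b) inside a comment is still flagged */
"""

for finding in scan_c_source(SAMPLE):
    print(finding)
```

The sample shows both failure modes mentioned in the text. The call inside a comment is reported even though it is harmless (too conservative), while a dangerous copy reached through a function pointer or a macro would not be reported at all (too permissive).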

Code  rewriting  is  a  less  pervasive  ap¬ 
proach  to  the  problem,  but  might  become 
more  important  (see  the  next  section).  With 
this  approach,  a  rewriting  tool  inserts  extra 
code  to  perform  dynamic  checks  that  ensure 
bad  things  cannot  happen.  For  example,  a 
Java  compiler  inserts  code  to  check  that 
each  array  index  is  in  bounds — if  not,  the 
code  throws  an  exception,  thereby  avoiding 
the  common  class  of  buffer  overflow  at¬ 
tacks.  Rewriting  can  be  carried  out  either  at 
the  application  code  level,  or  below  that  in 
subsystem  functionality  made  available 
through  APIs,  or  even  at  the  binary  level. 
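
As a rough illustration of rewriting at the application code level, the sketch below uses Python's standard `ast` module to replace every indexed read `seq[i]` with a call to a checking helper before the code is run. It assumes Python 3.9 or later and sequences indexed by plain integers; `checked_index`, `BoundsCheckInserter`, and `rewrite_and_load` are names invented for this example, not part of any tool discussed in the article.

```python
import ast


def checked_index(sequence, index):
    """Dynamic check inserted by the rewriter: reject out-of-range indexes."""
    if not 0 <= index < len(sequence):
        raise IndexError(f"index {index} out of bounds for length {len(sequence)}")
    return sequence[index]


class BoundsCheckInserter(ast.NodeTransformer):
    """Rewrite every read `seq[i]` into `checked_index(seq, i)`."""

    def visit_Subscript(self, node):
        self.generic_visit(node)  # rewrite nested subscripts first
        if isinstance(node.ctx, ast.Load) and not isinstance(node.slice, ast.Slice):
            call = ast.Call(
                func=ast.Name(id="checked_index", ctx=ast.Load()),
                args=[node.value, node.slice],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node


def rewrite_and_load(source):
    """Rewrite untrusted source, then execute it in a fresh namespace."""
    tree = BoundsCheckInserter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {"checked_index": checked_index}
    exec(compile(tree, "<rewritten>", "exec"), namespace)
    return namespace


UNTRUSTED = """
def first(items):
    return items[0]

def sneaky(items):
    return items[-1]
"""

loaded = rewrite_and_load(UNTRUSTED)
print(loaded["first"]([7, 8]))
```

After rewriting, `sneaky` raises `IndexError` instead of silently wrapping around to the last element, because the inserted check runs before the original access. The untrusted code itself never had to cooperate.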

Monitoring  programs,  using  a  reference 
monitor,  is  the  traditional  approach  used  to 
ensure  programs  do  not  do  anything  bad.  For 
instance,  an  operating  system  uses  the  page- 
translation  hardware  to  monitor  the  set  of  ad¬ 
dresses  that  an  application  attempts  to  read, 
write,  or  execute.  If  the  application  attempts 
to  access  memory  outside  of  its  address  space, 
the kernel takes action (such as by signaling a
segmentation fault). A more recent example of
an  online  reference  monitor  is  the  Java  Virtual 
Machine  interpreter.  The  interpreter  monitors 
execution  of  applets  and  mediates  access  to 
system  calls  by  examining  the  execution  stack 
to  determine  who  is  issuing  the  system  call  re¬ 
quest.3  In  this  case,  stack  inspection  serves  as  a 
policy  enforcement  mechanism. 

If  malicious  code  does  damage,  recovery 
is only possible if the damage can be properly
assessed and addressed. Creating an audit
trail that captures program behavior is
an  essential  step.  Several  program-auditing 
tools  are  commercially  available. 
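
A minimal sketch of an audit trail, assuming an in-memory list stands in for what would need to be append-only, tamper-resistant storage in practice. The `audited` decorator, `AUDIT_LOG`, and `delete_record` are hypothetical names chosen for this example.

```python
import functools
import time

AUDIT_LOG = []  # stand-in for append-only, tamper-resistant storage


def audited(func):
    """Record each call, its arguments, and its outcome."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        entry = {
            "time": time.time(),
            "action": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "outcome": "unknown",
        }
        try:
            result = func(*args, **kwargs)
            entry["outcome"] = "ok"
            return result
        except Exception as error:
            entry["outcome"] = f"failed: {error!r}"
            raise
        finally:
            AUDIT_LOG.append(entry)  # logged whether the call succeeded or not
    return wrapper


@audited
def delete_record(store, key):
    del store[key]


def actions_by_outcome(outcome_prefix):
    """Query the trail after the fact, e.g. to assess damage."""
    return [e["action"] for e in AUDIT_LOG if e["outcome"].startswith(outcome_prefix)]
```

Unlike monitoring, nothing here stops a harmful action. The value of the trail is that, afterwards, the recorded entries show what was done and with which arguments, which is the information recovery depends on.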

Each  of  the  basic  approaches — analysis, 
rewriting,  monitoring,  and  auditing — has  its 
strengths  and  weaknesses,  but  fortunately, 
these  approaches  are  not  mutually  exclusive 
and  can  be  used  in  concert.  Of  course,  to 
employ  any  of  them,  we  must  first  identify 
what  could  be  “harmful”  to  a  host.  Like 
any  other  computing  task,  we  must  turn  the 
vague  idea  of  “harm”  into  a  concrete,  de¬ 
tailed  specification — a  security  policy — so 
that  it  can  be  enforced  by  some  automated 
security  architecture.  Therein  lies  our  great¬ 
est  danger,  for  as  we  create  the  policy,  we 
are  likely  to  abstract  or  forget  relevant  de¬ 
tails  of  the  system.  An  attacker  will  turn  to 
these  details  first,  stepping  outside  our  pol¬ 
icy  model  to  circumvent  the  safeguards. 

Stick  to  Your  Principles 

To  protect  against  this  common  failing,  it 
is  important  to  follow  well-established  secu¬ 
rity  principles  when  designing  security  poli¬ 
cies.  One  of  the  most  important  principles, 
first stated by Jerome Saltzer and Michael
Schroeder in 1975,4 is the Principle of Least
Privilege:  a  component  should  be  given  the 
minimum  access  necessary  to  accomplish  its 
intended  task.  For  example,  we  shouldn’t 
give  a  program  access  to  all  files  in  a  system 
but  rather,  only  those  files  that  the  program 
needs  to  get  its  job  done.  This  prevents  the 
program  from  either  accidentally  or  mali¬ 
ciously  deleting  or  corrupting  most  files. 
Obviously,  the  fewer  files  that  the  program 
can  access,  the  less  the  potential  damage. 
Stated  simply,  tighter  constraints  on  a  pro¬ 
gram  lead  to  better  security. 
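
The file example above might look like this in practice. The sketch assumes a component is handed a `RestrictedFiles` object (a hypothetical class) instead of unrestricted file access. It illustrates the policy only: ordinary Python code running in the same process could still bypass it, so real enforcement would need one of the mechanisms discussed elsewhere in this article.

```python
from pathlib import Path


class RestrictedFiles:
    """Hand a component access to an explicit set of directories, nothing more."""

    def __init__(self, allowed_dirs):
        self._allowed = [Path(d).resolve() for d in allowed_dirs]

    def _check(self, path):
        resolved = Path(path).resolve()  # collapses ".." and follows symlinks
        for root in self._allowed:
            if resolved == root or root in resolved.parents:
                return resolved
        raise PermissionError(f"access to {resolved} is outside the granted set")

    def read_text(self, path):
        """Read a file only if it lies under one of the granted directories."""
        return self._check(path).read_text()
```

A component given `RestrictedFiles(["/var/app/data"])` (an illustrative path) can read its own data files, while a request such as `/var/app/data/../secrets.txt` is resolved first and then refused. The smaller the granted set, the less damage a faulty or malicious component can do.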

Another  important  security  principle  is 
the  Principle  of  Minimum  Trusted  Comput¬ 
ing  Base.  The  trusted  computing  base  (TCB) 
is  the  set  of  hardware  and  software  compo¬ 
nents  that  make  up  our  security  enforcement 
mechanisms.  The  Principle  of  Minimum 
TCB  states  that,  in  general,  the  best  way  to 
assure  that  your  system  is  secure  is  to  keep 
your TCB small and simple. Even in the
mid-1970s, operating system kernels were
thought  to  be  too  large  to  be  trusted.  Those 
systems  now  seem  small  and  tightly  struc¬ 
tured  compared  to  today’s  widely  used  ker¬ 
nels  composed  of  millions  of  lines  of  code. 


Current  Defenses 

We  now  turn  to  examples  of  currently 
deployed  defenses  for  malicious  code,  focus¬ 
ing  on  their  relative  pros  and  cons.  Unfortu¬ 
nately,  the  comparison  shows  that  the  pros 
are  outweighed  by  the  cons,  largely  because 
of  a  violation  of  the  Least  Privilege  and 
Minimum TCB principles.

OS-Based  Reference  Monitors 

Historically,  mechanisms  for  security  pol¬ 
icy  enforcement  have  been  provided  by  the 
computer  hardware  and  operating  system. 
Address  translation  hardware,  distinct  su¬ 
pervisor-  and  user-modes,  timer  interrupts, 
and  system  calls  for  invoking  a  trusted  soft¬ 
ware  base  serve  in  combination  to  enforce 
limited  forms  of  availability,  fault  contain¬ 
ment,  and  authorization  properties. 

To  a  large  degree,  these  mechanisms  have 
proven  effective  for  protecting  operating  sys¬ 
tem  resources  (such  as  files  or  devices)  from 
unauthorized  access  by  humans  or  malicious 
code.  But  the  mechanisms  work  with  a  fixed 
system-call  interface  and  a  fixed  vocabulary 
of  principals,  objects,  and  operations  for 
policies.  Only  by  incurring  significant  cost 
and  usability  penalties  can  that  vocabulary 
be  expanded.  It  rarely  is.  Currently,  most 
desktop  machines  are  configured  as  single- 
user,  so  applications  have  complete  access  to 
the  machine  resources. 
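
The fixed vocabulary of principals, objects, and operations can be pictured as an access-control matrix. The sketch below is an assumption-laden toy: the users, paths, and the `is_authorized` function are invented for illustration and do not describe any particular operating system.

```python
# Hypothetical access-control matrix: (principal, object) -> allowed operations.
ACCESS_MATRIX = {
    ("alice", "/home/alice/notes.txt"): {"read", "write"},
    ("bob", "/home/alice/notes.txt"): {"read"},
}


def is_authorized(principal, obj, operation):
    """Reference-monitor decision: deny unless the matrix grants the operation."""
    return operation in ACCESS_MATRIX.get((principal, obj), set())


# The monitor sees only the principal, never which program is asking.
request_from_editor = ("alice", "/home/alice/notes.txt", "write")
request_from_mail_script = ("alice", "/home/alice/notes.txt", "write")

print(is_authorized(*request_from_editor))
print(is_authorized(*request_from_mail_script))
```

Both requests are indistinguishable to this monitor, because the vocabulary has no way to say "Alice's editor" as opposed to "a script that arrived in Alice's mail". On a single-user desktop, that means every application inherits everything the user may do.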

Scanning  for  Known  Malicious  Code 

In the days before networking was rampant,
malicious code mostly used the “sneaker
net” as its vector. Viruses spread from machine
to  machine  by  humans  carrying  floppy  disks 
with  infected  programs  on  them.  Perhaps  the 
built-in  limitations  in  the  vector  kept  the  num¬ 
ber  of  viruses  small.  In  any  case,  the  limited 
number  of  viruses  combined  with  the  ineffi¬ 
ciencies  in  the  communication  vector  made 
possible  the  strategy  of  blacklisting. 

Blacklisting,  a  strategy  used  by  most  com¬ 
mercial  antivirus  products,  matches  pro¬ 
grams  against  a  database  of  known  virus  sig¬ 
natures  (such  as  code  fragments).  If  a  match 
is  found,  the  program  is  disabled.  Today, 
commercial  products  scan  not  only  binary 
programs,  but  also  email  messages,  Web 
pages,  or  documents  looking  for  viruses  in 
the  form  of  scripts.  This  approach’s  limita¬ 
tions  are  obvious.  Unknown  malicious  code 
will easily get by the simple defenses to carry
out its dirty work. Until vendors can contain
a new virus and add a signature entry to the
database, it can run rampant. Recall both
the Melissa virus and the Love Bug. Another
limitation of the approach is that it does not
scale well; each file must be scanned against
an ever-growing list of viruses.

Clearly,  blacklisting  by  itself  does  not 
provide  adequate  security.  It  is  too  easy  to 
make  trivial  changes  to  malicious  code  (a 
process  that  can  be  automated  in  the  code 
itself) to thwart almost every blacklisting
scheme. Nevertheless, blacklisting is cheap
to  implement  and  is  thus  worthwhile  even  if 
it  only  stops  the  occasional  naive  attack. 
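
Signature matching and its main weakness can be shown in a few lines. The signature names and byte patterns below are made up for the example and do not correspond to any real malicious code; `scan` is likewise a hypothetical helper.

```python
# Hypothetical signature database: name -> byte pattern from a known sample.
SIGNATURES = {
    "demo-virus-a": b"\xde\xad\xbe\xef\x13\x37",
    "demo-script-b": b"SendCopiesToAddressBook",
}


def scan(data):
    """Return the names of all known signatures that occur in `data`."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


known = b"header" + SIGNATURES["demo-script-b"] + b"trailer"

# A trivial variant: the same marker with a single character changed.
variant = known.replace(b"SendCopies", b"SendCopiez")

print(scan(known))
print(scan(variant))
```

The known sample is caught and the one-character variant is not, which is the evasion described above. The cost also grows with the database, since every file is compared against every pattern.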

Code  Signing 

Code  signing  is  an  approach  for  authen¬ 
ticating  code  based  on  public-key  cryptog¬ 
raphy  and  digital  signatures.  The  digital  sig¬ 
nature  lets  a  user  determine  which  particu¬ 
lar  key  the  code  was  signed  with  and  ensure 
(with  high  probability)  that  the  code  has  not 
been  tampered  with  since  it  was  signed. 
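
To make the guarantee concrete, here is a toy sign-and-verify round trip using textbook RSA from Python's standard library only. It is an illustration under strong assumptions: the key is built from two small, publicly known Mersenne primes and there is no padding, so it is insecure and unlike any deployed code-signing scheme. The names `sign`, `verify`, and `digest` are chosen for this example, and `pow(E, -1, ...)` requires Python 3.8 or later.

```python
import hashlib

# Toy key built from two known Mersenne primes. Far too small and missing
# padding, so it is insecure; it only illustrates the round trip.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent


def digest(code):
    """Hash the code and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(code).digest(), "big") % N


def sign(code):
    """Done by the key holder: apply the private exponent to the digest."""
    return pow(digest(code), D, N)


def verify(code, signature):
    """Done by the user: the public exponent must recover the digest."""
    return pow(signature, E, N) == digest(code)


original = b"print('hello')"
signature = sign(original)
tampered = original + b"  # extra payload"
```

Verification succeeds for `original` and fails for `tampered`, which is all a signature establishes: this key signed exactly these bytes. The same `sign` function would just as readily sign harmful code, so a valid signature says nothing about whether the code is good.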

Unfortunately,  most  people  assume  that 
digital  signatures  imply  a  lot  more  than  they 
really do. In particular, people typically
assume that the signed code was signed by the
owner  of  the  key,  that  the  owner  of  the  key 
wrote  the  code,  that  the  code  is  good,  and 
that  the  code  may  be  safely  used  in  any  con¬ 
text.  But  these  assumptions  are  often  not  true! 

For  instance,  if  a  key  is  stolen,  anyone 
can  use  it  to  sign  any  piece  of  code — includ¬ 
ing  malicious  code.  As  another  example,  the 
developer  might  consider  the  code  to  be 
“good”  and  thus  sign  it,  even  if  the  code 
contains  a  Trojan  horse  or  virus.  Finally, 
what  developers  or  retailers  consider  to  be 
good  might  not  be  good  for  the  user:  A 
component  that  sends  back  information  to 
the  home  office  may  seem  useful  to  a  ven¬ 
dor,  but  will  probably  be  considered  a  vio¬ 
lation  of  privacy  by  the  user. 

Thus, while code signing is a useful technology,
it suffers from some real limitations,
not the least of which is poor understanding
of  what  a  digital  signature  really  means. 
Furthermore,  the  adoption  of  code  signing 
has  been  hampered  by  the  lack  of  a  Public 
Key  Infrastructure.  Very  few  PKI  installa¬ 
tions  have  been  deployed,  and  those  that 
have  do  not  begin  to  approach  Internet 
scale.  Without  a  solid  PKI,  code  signing  will 
not  become  common. 


Promising  New  Defenses 

Now  we’ll  discuss  some  promising  tech¬ 
nologies,  identified  by  the  study  group,  that 
are  emerging  from  research  labs. 

Software-Based  Reference  Monitors 

Robert  Wahbe  and  his  colleagues  sug¬ 
gested  software-based  fault  isolation  as  an 
alternative  to  the  traditional  hardware- 
based  mechanisms  used  to  ensure  memory 
safety.5 Their goal was to reduce the overhead
of cross-domain procedure calls and
to provide a more flexible memory-safety
model. Their basic idea is to rewrite binary
code  by  inserting  checks  on  each  memory 
access  and  each  control  transfer  to  ensure 
that  those  accesses  are  valid.  Fred  Schneider 
generalized  the  SFI  idea  to  in-lined  reference 
monitors.6  With  the  IRM  approach,  a  secu¬ 
rity  policy  is  specified  in  a  declarative  lan¬ 
guage,  and  a  general-purpose  tool  rewrites 
code,  inserting  extra  checks  and  state  that 
serve  to  enforce  the  policy.  In  principle,  any 
security  policy  that  is  a  safety  property  can 
be  enforced,  so  the  approach  is  quite  pow¬ 
erful.  For  example,  it  can  enforce  any  dis¬ 
cretionary  access  control  policy.  The  ap¬ 
proach  is  also  practical:  Prototypes  have 
been built at both Cornell and MIT.7-9 One
of  the  Cornell  prototypes,  PSLang/PoET, 
works  for  the  Java  Virtual  Machine  lan¬ 
guage  and  gives  competitive  performance 
for  the  implementation  of  Java’s  stack  in¬ 
spection  security  policy. 
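
The in-lined reference monitor idea can be sketched with a small security automaton. The policy used here, "no network send after a file read", is a common textbook example of a safety property. The sketch does not model PSLang/PoET or binary rewriting: a Python decorator stands in for the rewriting tool, and every name in it is an assumption made for illustration.

```python
class PolicyViolation(Exception):
    """Raised by the inserted check when the policy would be violated."""


class NoSendAfterRead:
    """Security automaton for the policy 'no network send after a file read'."""

    def __init__(self):
        self.has_read = False  # the automaton's only state

    def step(self, event):
        if event == "read":
            self.has_read = True
        elif event == "send" and self.has_read:
            raise PolicyViolation("send attempted after read")


def inline_monitor(monitor, event):
    """'Rewrite' a function so the check runs before its original body."""
    def decorate(func):
        def guarded(*args, **kwargs):
            monitor.step(event)  # inserted check and state update
            return func(*args, **kwargs)
        return guarded
    return decorate


monitor = NoSendAfterRead()


@inline_monitor(monitor, "read")
def read_file(name):
    return f"data from {name}"  # placeholder instead of real I/O


@inline_monitor(monitor, "send")
def send(message):
    return f"sent {message}"  # placeholder instead of real networking
```

A program may call `send` freely until it first calls `read_file`; after that, the inserted check stops any further `send` before it happens. The extra state (`has_read`) travels with the rewritten code, which is what lets this approach express history-dependent policies.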

Type-Safe Languages

Type-safe  programming  languages,  such 
as  Java,  Scheme,  or  ML,  ensure  that  opera¬ 
tions  are  only  applied  to  values  of  the  ap¬ 
propriate  type.  Type  systems  that  support 
type  abstraction  let  programmers  specify 
new,  abstract  types  and  signatures  for  oper¬ 
ations  that  prevent  unauthorized  code  from 
applying  the  wrong  operations  to  the  wrong 
values.  In  this  respect,  type  systems,  like 
software-based  reference  monitors,  go  be¬ 
yond  operating  systems  in  that  they  can  be 
used  to  enforce  a  wider  class  of  application- 
specific  access  policies.  Static  type  systems 
also  enable  offline  enforcement  through 
static  type  checking  instead  of  each  time  a 
particular  operation  is  performed.  This  lets 
the  type  checker  enforce  certain  policies  that 
are difficult with online techniques. For example,
Andrew Myers’ JFlow extends the
Java type system to enforce the policy that
high-security data should never be leaked.10
Current research in type systems is aimed at
eliminating more run-time checks (such as
array bounds checks11) or type-checking
machine code.12

Table 2. Examples of malicious code understood in our policy-based framework.

IEEE SOFTWARE September/October 2000

Proof-Carrying  Code 

Proof-carrying  code  (PCC),  a  concept  in¬ 
troduced  by  George  Necula  and  Peter  Lee,13 
is  a  promising  approach  for  gaining  high  as¬ 
surance  of  security  in  systems.  The  basic  idea 
is  to  require  any  untrusted  code  to  come 
equipped  with  an  explicit,  machine-check¬ 
able  proof  that  the  code  respects  a  given  se¬ 
curity  policy.  Before  executing  the  code,  we 
simply  verify  that  the  proof  is  valid  with  re¬ 
spect  to  both  the  code  and  the  policy.  Be¬ 
cause  proof  checkers  can  be  quite  simple 
(Necula’s  is  about  six  pages  of  C  code),  it  is 
easier  to  ensure  that  they  are  correct.  And  in 
principle,  PCC  can  enforce  any  security  pol¬ 
icy — not  just  type  safety — as  long  as  the 
code  producer  can  construct  a  proof.  Necula 
and  Lee  have  shown  that  such  proofs  can  be 
constructed  automatically  for  standard  type- 
safety  policies,  if  a  compiler  for  a  type-safe 
programming  language  generates  the  code. 
Unfortunately,  going  beyond  standard  no¬ 
tions  of  type  safety  cannot  be  performed  au¬ 
tomatically  without  either  restricting  the 
code  or  requiring  human  intervention.  It  is 
unlikely  that  programmers  will  construct  ex¬ 
plicit  proofs.  Thus  an  active  area  of  research 
is  how  to  integrate  compilers  and  modern 
theorem  provers  to  produce  PCC. 
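
The division of labor in PCC, where the producer does the hard work and the consumer runs a small checker, can be illustrated with a toy. The sketch below is not Necula and Lee's system: instead of a logical proof it uses per-instruction stack-depth annotations as a stand-in certificate, for a hypothetical stack machine and the policy "the stack never underflows". The opcodes, `EFFECT`, and `check_proof` are assumptions made for this example, and instructions are assumed to be well-formed tuples with integer jump targets.

```python
# (pops, pushes) for each opcode of a hypothetical toy stack machine.
EFFECT = {"push": (0, 1), "add": (2, 1), "pop": (1, 0), "jmpz": (1, 0)}


def check_proof(code, depths):
    """Consumer side: validate the producer's stack-depth annotations.

    depths[i] claims the stack depth before instruction i, and
    depths[len(code)] the depth at exit. One linear pass suffices; the
    checker never has to discover the depths itself.
    """
    if len(depths) != len(code) + 1 or depths[0] != 0:
        return False
    for i, (op, *operand) in enumerate(code):
        if op not in EFFECT:
            return False
        pops, pushes = EFFECT[op]
        if depths[i] < pops:  # would underflow: policy violated
            return False
        after = depths[i] - pops + pushes
        successors = [i + 1]
        if op == "jmpz":  # conditional jump: fall through or go to target
            if len(operand) != 1:
                return False
            successors.append(operand[0])
        for target in successors:
            if not 0 <= target <= len(code) or depths[target] != after:
                return False
    return True


CODE = [("push", 1), ("push", 2), ("add",), ("jmpz", 6), ("push", 9), ("pop",)]
PROOF = [0, 1, 2, 1, 0, 1, 0]  # shipped by the producer alongside the code
```

The checker accepts `CODE` with `PROOF`, and rejects the pair if either the code or the annotations are altered so that they no longer agree. The consumer therefore does not need to trust the producer, only this short checker.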

Policy as Achilles’ Heel

Thus  far,  we  have  focused  on  technology 
solutions  to  the  malicious  code  problem.  To 
be  sure,  technology  can  be  of  service;  but 
there  is  another  critical  aspect  of  the  prob¬ 
lem  that  remains  to  be  addressed — the  prob¬ 
lem  of  policy. 

In  current  forms,  extensible  systems  do 
little  to  determine  how  a  system  will  behave 
when extended in certain ways or, put another
way, what a particular piece of code
can  and  cannot  do.  In  fact,  today’s  comput¬ 
ers  are  hyper-malleable  and  overly  compli¬ 
cated.  This  greatly  increases  the  malicious 
code  risk.  In  the  end,  determining  whether 
something  malicious  is  happening  requires 
first  defining  some  policy  to  enforce. 

When  Policy  Breaks  Down 

Clearly,  the  notion  of  policy  is  deeply  in¬ 
tertwined  with  the  concept  of  malicious 
code.  Understood  in  terms  of  policy,  the 
root  causes  of  malicious  code  fall  into  two 
basic  categories:  bad  policy  and  incorrectly 
enforced  policy. 

Bad  policy  allows  malicious  code  to  do 
something  malicious  because  policy  does 
not  forbid  it.  Even  if  policy  is  perfectly  en¬ 
forced  by  technology,  the  policy  itself  must 
be  well  formed.  Subcategories  of  bad  policy 
include: 

■  misunderstandings  of  context,  whereby 
policy  makes  no  sense  in  the  context 
where  it  was  applied; 

■  over  restriction,  whereby  the  policy  pre¬ 
vents  useful  work  when  it  is  enforced;  or 

■  noncomprehensiveness,  whereby  policy 
fails  to  cover  some  situation  or  exists  at 
the  wrong  level  of  abstraction. 

Incorrect  policy  enforcement  allows  code 
to  do  something  malicious  even  if  it  is  cor¬ 
rectly  forbidden  by  policy.  This  situation 
arises  when  either 

■  the  enforcement  mechanism  is  too  weak 
to  implement  the  desired  policy; 

■  there  are  bugs  in  the  implementation  of 
the  enforcement  mechanism;  or 

■  the  enforcement  mechanism  is  miscon- 
figured. 

Table  2  provides  examples  of  malicious 
code  understood  in  our  policy-based  frame¬ 
work. 

As an example of context misunderstanding,
consider the role of scripting languages
such as Visual Basic. Such languages
can be extremely useful and perfectly
safe  in  some  contexts.  But  in  other  contexts, 
scripts  can  be  extremely  dangerous.  For  in¬ 
stance,  there  is  rarely  a  need  for  scripts  or 
macros  to  be  run  when  displaying  a  docu¬ 
ment,  yet  this  functionality  is  exactly  what 
the  Melissa  virus  exploited. 

An  example  of  a  policy  that  is  too  restric¬ 
tive  is  one  in  which  users  are  required  to  pick 
new  passwords  at  small  intervals.  Under  such 
a  policy,  people  often  forget  their  current 
password.  To  avoid  this,  they  may  write 
down  their  password  in  an  insecure  place, 
making  it  easier  for  an  attacker  to  steal  it. 

Most  security  policies  fail  to  be  comprehen¬ 
sive,  simply  because  designers  cannot  think  of 
all  possible  attacks.  For  instance,  in  the  early 
days  of  the  Internet,  there  was  no  need  for  an 
e-mail  security  policy,  because  mail  readers  did 
not  interpret  messages.  Today,  messages  can 
contain  attachments  or  scripts  that  are  auto¬ 
matically  executed  by  readers. 

Many  desirable  security  policies  just 
aren’t  achievable  or  practical.  For  instance, 
stringent  policies  have  been  formulated  for 
smart  cards  to  prevent  disclosure  of  private 
or  secret  information,  such  as  health  records 
or  crypto  keys.  But  hiding  information  is  a 
tricky  business  and  just  about  any  enforce¬ 
ment  mechanism  will  fail  to  block  all  infor¬ 
mation  flow  off  the  card.  In  the  case  of 
smart  cards,  the  designers  used  clever  algo¬ 
rithms  and  packaging  techniques  to  prevent 
tampering  with  the  card  to  learn  private  in¬ 
formation.  But  they  failed  to  take  into  ac¬ 
count  of  the  power  fluctuations  across  the 
connection  pins — data  that  can  be  used  to 
reconstruct  private  information.14 

Sometimes  an  enforcement  mechanism  is 
powerful  enough  to  implement  the  policy, 
but its implementation has bugs or weaknesses
that prevent it from doing so. The
classic examples of such bugs are the buffer
overflow  attacks  that  arise  in  operating  sys¬ 
tems  and  applications. 

Finally,  sometimes  the  enforcement 
mechanism  is  powerful  enough  and  coded 
properly,  but  simply  misconfigured.  For  ex¬ 
ample,  the  sendmail  program  has  debugging 
features  that  allow  a  programmer  to  gain 
remote  access  to  a  machine.  During  devel¬ 
opment,  this  feature  was  turned  on.  Unfor¬ 
tunately,  the  feature  remained  on  when 
sendmail  was  deployed  and  subsequent  at¬ 
tacks  such  as  the  Morris  worm  took  advan¬ 
tage  of  the  opening. 
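
One modest guard against this kind of misconfiguration is a start-up check that refuses to run with development-only options enabled. The sketch below is generic and hypothetical: the option names `debug_commands` and `remote_diagnostics` and both functions are invented for illustration and are not sendmail settings.

```python
def configuration_problems(settings):
    """Return the reasons a (hypothetical) service configuration is unsafe to deploy."""
    problems = []
    if settings.get("environment") == "production":
        # Development conveniences that must not survive into deployment.
        for option in ("debug_commands", "remote_diagnostics"):
            if settings.get(option, False):
                problems.append(f"{option} is enabled in production")
    return problems


def start_service(settings):
    """Refuse to start rather than run with a known-dangerous configuration."""
    problems = configuration_problems(settings)
    if problems:
        raise RuntimeError("; ".join(problems))
    return "started"
```

A check like this only catches the mistakes its authors anticipated, so it complements careful configuration review rather than replacing it.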

Addressing  the  malicious  code  problem 
requires  the  creation  of  sound  policy  and  its 
careful  enforcement  through  technology. 

The  Many  Levels  of  Policy 

System  administrators  and  MIS  security 
people  think  about  policy  in  terms  of  user 
groups,  firewall  rules,  and  computer  use. 
Security  researchers  steeped  in  program¬ 
ming  languages  think  about  policy  in  terms 
of  memory  safety  and  liveness  properties. 
Government  policy  wonks  think  about  pol¬ 
icy  in  terms  of  rules  and  regulations  im¬ 
posed  on  users  and  systems.  The  problem  is, 
all  of  these  ways  of  thinking  about  policy 
are  equally  valid! 

So  how  are  we  to  set  policy  to  combat  ma¬ 
licious  code?  We  believe  the  key  is  to  focus 
on  defining  metalevel  policies  that  system  ad¬ 
ministrators  work  with  naturally  in  terms  of 
collections  of  lower-level  enforcement  mech¬ 
anisms.  This  is  no  trivial  undertaking. 

Most  of  the  technologies  we’ve  explored 
earlier  in  this  article  can  serve  to  enforce 
particular  aspects  of  software  behavior. 
Some  language  researchers,  for  example, 
consider  the  issue  of  enforcing  safety  prop¬ 
erties  “solved,”  at  least  in  theory.  Enforcing 
liveness  properties  or  confidentiality  is 
harder,  but  fairly  clear  research  agendas  ex¬ 
ist  to  address  the  open  issues.  Of  course,  the 
terms  safety ,  liveness ,  and  confidentiality 
have  technical  meanings.  Intuitively,  a 
safety  property  states  that  a  program  will 
never  perform  a  bad  action,  for  some  pre¬ 
cisely  defined  notion  of  “bad.”  An  example 
of  a  bad  action  is  overflowing  a  buffer.  A 
liveness  property,  on  the  other  hand,  states 
that  a  program  will  eventually  perform 
some  desired  action  or  set  of  actions.  For 
example, the property that a program will
eventually release all of the memory that it
allocates  is  a  liveness  property.  Finally,  con¬ 
fidentiality  is  meant  to  ensure  that  certain 
values  remain  private  or  secret. 
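
The difference between safety and liveness shows up clearly when checking recorded event traces. In the sketch below, the event format and both function names are assumptions made for illustration: a trace is a list of tuples such as ("alloc", block), ("free", block), or ("write", offset, buffer_size).

```python
def violates_safety(trace, is_bad):
    """Safety: a finite prefix of the run already shows the violation.

    Returns the index of the first bad event, or None if none occurs.
    """
    for index, event in enumerate(trace):
        if is_bad(event):
            return index
    return None


def unreleased_blocks(trace):
    """Liveness-style check on a finished run: which blocks were never freed?

    On an unfinished run a non-empty result proves nothing, since the
    matching free may still happen later.
    """
    live = set()
    for event in trace:
        if event[0] == "alloc":
            live.add(event[1])
        elif event[0] == "free":
            live.discard(event[1])
    return live


def is_overflow(event):
    """A write is bad if its offset falls outside the buffer."""
    return event[0] == "write" and not 0 <= event[1] < event[2]


TRACE = [
    ("alloc", "a"),
    ("write", 3, 8),
    ("alloc", "b"),
    ("free", "a"),
    ("write", 8, 8),  # offset 8 in an 8-byte buffer: out of bounds
]
```

The overflow can be detected, and in principle stopped, at the moment it occurs, which is why monitors and rewriters handle safety properties well. The unreleased block `"b"` only counts as a violation once the run is known to be over, which hints at why liveness is harder to enforce at run time.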

The  problem  is  that  low-level  properties 
such  as  safety  and  liveness  do  not  align  nicely 
with  what  most  security  administrators  think 
of  as  policy  building  blocks.  Thus  an  open 
question  is  how  to  express  reasonable  secu¬ 
rity  policy  that  can  be  directly  transformed 
into  technology  enforcement  solutions. 

The  answer  is  to  understand  policy  as  a 
layered  set  of  abstractions.  Some  prelimi¬ 
nary  work  exists  (for  example  Netscape 
Navigator’s  approach  to  policy  sets  based 
on  expected  code  behavior),  but  much  work 
remains  to  be  done. 


The malicious code problem will
continue to grow as the Internet
grows. The constantly accelerating trends
of interconnectedness, complexity,
and  extensibility  make  addressing  the  prob¬ 
lem  more  urgent  than  ever.  As  extensible  in¬ 
formation  systems  become  more  ubiqui¬ 
tous,  moving  into  everyday  devices  and 
playing  key  roles  in  life-critical  systems,  the 
level  of  the  threat  moves  out  of  the  techni¬ 
cal  world  and  into  the  real  world.  We  must 
work  on  this  problem. 

Our best hope in combating malicious code is creating sound policy about
software behavior and enforcing that policy through the use of technology. An
emphasis on one or the other alone will do little to help. Any answer will
require a set of enforcement technologies that can be directly tied to policy
set and understood by nontechnical users.


Acknowledgements 

Study Group members include Gary McGraw, Cigital, Chair; Avi Rubin, AT&T
Research; Ed Felten, Princeton; Peter G. Neumann, SRI; Lee Badger, NAI Labs;
Greg Morrisett, Cornell; Tim Teitelbaum, Grammatech; Virgil Gligor, University
of Maryland; Tom Markham, Secure Computing; Jay Lepreau, University of Utah;
Bob Balzer, ISI; Joshua Haines, Lincoln Labs; Roger Thompson, ICSA.net; Bob
Clemons, NSA; Penny Chase, MITRE; Carl Landwehr, Mitretek; Brad Arkin, Reliable
Software Technologies; Sami Saydjari, DARPA; Brian Witten, DARPA; and Dave
Thompson, Mitretek. Guests who participated in the two-day San Antonio workshop
include mudge, the L0pht; Crispin Cowan, Wirex; Fred Schneider, Cornell; Peter
Lee, CMU; Richard Smith, Phar Lap; John Rushby, SRI; Dan Wallach, Rice
University; Amy Felty, University of Ottawa; and David Evans, University of
Virginia.

The  workshops  on  which  this  report  is  based  were 
convened  under  the  auspices  of  the  Infosec  Research 
Council  (IRC),  with  members  from  US  Government 
organizations  that  sponsor  and  conduct  information 
security  research.  Views  expressed  in  the  report  are 
those  of  the  authors  and  may  not  reflect  those  of  the 
IRC,  its  members,  or  the  organizations  they  represent. 


References 

1. J. Viega et al., “ITS4: A Static Vulnerability Scanner for C and C++ Code,”
to appear in Proc. Ann. Computer Security Applications Conf. 2000, IEEE
Computer Soc. Press, Los Alamitos, Calif.; the ITS4 tool is available at
www.rstcorp.com/its4.

2. D. Wagner et al., “A First Step Towards Automated Detection of Buffer
Overrun Vulnerabilities,” Proc. Network and Distributed Systems Security
Symposium (NDSS 2000), Internet Soc., Reston, Va., 2000, pp. 3-18.

3. G. McGraw and E. Felten, Securing Java: Getting Down to Business with
Mobile Code, John Wiley & Sons, New York, 1999; complete Web edition at
www.securingjava.com.

4. J.H. Saltzer and M.D. Schroeder, “The Protection of Information in Computer
Systems,” Proc. IEEE, IEEE Press, Piscataway, N.J., Vol. 63, No. 9, 1975, pp.
1278-1308.

5. R. Wahbe et al., “Efficient Software-Based Fault Isolation,” Proc. 14th ACM
Symp. Operating System Principles (SOSP), ACM Press, New York, 1993, pp.
203-216.

6. F. Schneider, “Enforceable Security Policies,” ACM Trans. Information and
System Security, Vol. 2, No. 4, Mar 2000.

7. U. Erlingsson and F.B. Schneider, “IRM Enforcement of Java Stack
Inspection,” IEEE Symp. Security and Privacy, IEEE Press, Piscataway, N.J.,
2000.

8. D. Evans and A. Twyman, “Policy-Directed Code Safety,” Proc. IEEE Symp.
Security and Privacy, IEEE Press, Piscataway, N.J., 1999; see also
www.cs.virginia.edu/~evans.

9. U. Erlingsson and F.B. Schneider, “SASI Enforcement of Security Policies: A
Retrospective,” Proc. New Security Paradigms Workshop, ACM Press, New York,
1999, pp. 246-255.

10. A.C. Myers, “JFlow: Practical Mostly Static Information Flow Control,”
Proc. 26th ACM Symp. Principles of Programming Languages (POPL), ACM Press,
New York, 1999, pp. 228-241.

11. H. Xi and F. Pfenning, “Dependent Types in Practical Programming,” Proc.
26th ACM Symp. Principles of Programming Languages (POPL), ACM Press, New
York, 1999, pp. 214-227.

12. G. Morrisett et al., “From System-F to Typed Assembly Language,” ACM
Trans. Programming Languages and Systems, Vol. 21, No. 3, May 1999, pp.
528-569; www.cs.cornell.edu/talc.

13. G.C. Necula, “Proof-Carrying Code,” Proc. 24th ACM Symp. Principles of
Programming Languages (POPL), ACM Press, New York, 1997, pp. 106-119;
www-nt.cs.Berkeley.edu/home/necula/public_html/pcc.html.

14. P. Kocher, J. Jaffe, and B. Jun, “Differential Power Analysis: Leaking
Secrets,” Advances in Cryptology - CRYPTO ’99, M. Wiener, ed., Lecture Notes
in Computer Science, Vol. 1666, Springer, New York, Aug. 1999, pp. 388-397;
www.cryptography.com/dpa/Dpa.pdf.


September/October 2000 IEEE SOFTWARE


{final  submitted  manuscript} 

Building  Secure  Software 

How  to  avoid  security  problems  the  right  way 

By  John  Viega  and  Gary  McGraw 
All comments to viega@list.org and gem@cigital.com

© 2001 John Viega and Gary McGraw



worse  yet  accidentally  propagate  a  virus  by  installing  new  programs  or  software  updates. 




Penetrate  and  Patch  is  Bad 

Many  well-known  software  vendors  don’t  yet  understand  that  security  is  not  an  add-on 
feature.  They  continue  to  design  and  create  products  at  alarming  rates,  with  little  attention 
paid  to  security.  They  start  to  worry  about  security  only  after  their  product  has  been 




when  efficiency  isn’t  needed).  Efficiency  can  be  an  important  goal,  though  it 
usually  trades  off  against  simplicity.  Security  often  comes  with  significant 
Software Risk Management for Security

Gary  McGraw  Ph.D. 
COMPUTER SECURITY is taking on new importance as electronic commerce metamorphoses
from hype to reality. Large and small businesses alike are reinventing themselves as
e-commerce players. The implications for computer security practice are immense. When
bits count as money, protecting bits becomes as important as any other aspect of running
a successful business.

One essential element shared by every modern information system is the software that
determines how the system behaves. Today's software problems lead to spectacular
real-world failures of many different kinds, including security problems, reliability
problems, and safety problems. It is probably only a matter of time before software
causes the demise of a large company.

What can we do to combat software bugs lying at the root of these problems, especially
in light of the rush to embrace e-commerce and the intense pressure of Internet time?
How can we avoid treating security as an add-on feature, when, like dependability,
security is really a property of a complete system? This column discusses an approach to
security analysis that we have applied successfully over the last several years at
Cigital. Our approach is no magic bullet, but it offers a reasoned methodology that has
proven to be useful in the trenches.

A  View  From  50,000  Feet 

OUR METHODOLOGY, like many useful things, is a mix of art and engineering. The idea is
straightforward: Design a system with security in mind, analyze the system in light of
known and anticipated risks, rank the risks according to their severity, test to the
risks, and cycle broken systems back through the design process.

The process outlined above has one essential underlying goal: avoiding the unfortunately
pervasive penetrate-and-patch approach to computer security, that is, avoiding the
problem of desperately trying to come up with a fix to a problem that is being actively
exploited by attackers. In simple economic terms, finding and removing bugs in a
software system before its release is orders of magnitude cheaper and more effective
than trying to fix systems after release. The many problems inherent in the
penetrate-and-patch approach have been discussed elsewhere, but the main concerns are:

•  Patches  are  rarely  applied  by  overworked  system  administrators 
who  would  rather  not  tweak  their  functioning  systems; 

•  patches  are  rushed  out  under  immense  pressure  from  the  market 
and  often  introduce  new  problems  of  their  own;  and 

•  patches  are  superficial  solutions  that  do  not  get  at  the  heart  of 
software  problems. 

Designing a system for security, and testing the system extensively before release,
presents a much better alternative.

Risk  Assessment 

THERE IS NO SUCH THING AS 100 PERCENT SECURITY. In fact, there is a fundamental tension
inherent in today’s technology between functionality (an essential property of any
working system) and security (also essential in many cases). A common joke goes that the
most secure computer in the world is one that has its disk wiped, is turned off, and is
buried in a ten-foot hole filled with concrete. Of course, a machine that secure also
turns out to be useless. In the end, the security question boils down to how much risk a
given enterprise is willing to take on in order to solve the problem at hand
effectively. Security is really a question of risk management.

The key to an effective risk assessment is expert knowledge of security. Being able to
recognize situations where common attacks can be applied is half the battle. The first
step in any analysis is recognizing the risks. This step is most effectively applied to
a system’s specification. Once risks have been identified, the next step is ranking the
risks in order of severity. Any such ranking is a context-sensitive undertaking that
depends on the needs and goals of the system at hand. Some risks may not be worth
mitigating, depending, for example, on how expensive carrying out a successful attack
might be. Ranking risks is essential to allocating testing and analysis resources
further down the line. Since resource allocation is a business problem, making good
business decisions regarding such allocation requires sound data.
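
As a rough illustration of the ranking step, the sketch below scores each identified risk and sorts the list so that testing effort can go to the top entries first. The scoring formula (likelihood times impact, discounted by how expensive the attack is to carry out) and all of the numbers are assumptions made for the example; the column does not prescribe a formula, and real rankings depend on the system's context.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: float  # assumed scale: 0.0 (never) to 1.0 (certain)
    impact: int        # assumed scale: 1 (minor) to 10 (business-ending)
    attack_cost: int   # assumed scale: 1 (cheap to attack) to 10 (very expensive)

    @property
    def severity(self) -> float:
        # Hypothetical score: expensive attacks lower the priority of a risk.
        return self.likelihood * self.impact / self.attack_cost


def rank_risks(risks: Iterable[Risk]) -> List[Risk]:
    """Return the risks ordered from most to least severe."""
    return sorted(risks, key=lambda risk: risk.severity, reverse=True)


# Illustrative values for a commercial Web server.
risks = [
    Risk("eavesdropping", likelihood=0.6, impact=8, attack_cost=2),
    Risk("playback", likelihood=0.3, impact=6, attack_cost=3),
    Risk("denial of service", likelihood=0.5, impact=9, attack_cost=1),
]
ranked_names = [risk.name for risk in rank_risks(risks)]
```

The same risks would likely rank differently for a client machine, where denial of service matters far less, which is the context sensitivity the text describes.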

Given a ranked set of potential risks in a system, testing for security is possible.
Testing requires a live system and is an empirical activity requiring close observation
of the system under test. Security tests often do not result in clear-cut results like
obvious system penetrations, though sometimes they do. More often a system will behave
in a strange or curious fashion that tips off an analyst that something interesting is
afoot. These sorts of hunches can be further explored.

Sound  Software  Engineering 

ONE PREMISE OF OUR METHODOLOGY IS THE USE OF SOUND SOFTWARE ENGINEERING PRACTICES.
Process is a good thing, in moderation. Any system that is designed according to
well-understood requirements will be better than a system thrown together arbitrarily.
An apt example of a security-related requirement is one stating that some data must be
protected against eavesdropping because it is particularly sensitive.

From a set of requirements, a system specification can be created. The importance of
solid system specification cannot be overemphasized. After all, without a specification,
a system cannot be wrong, it can only be surprising! And when it comes to running a
business, security surprises are not something we want.

A solid specification draws a coherent big-picture view of what the system does and why
the system does it. Specifications should be as formal as possible, without becoming
overly arcane. Formality is extremely powerful, but it too is no silver bullet. Remember
that the essential raison d’être for a specification is understanding. The clearer and
easier to understand a specification is, the better the resulting system will be.

The  Importance  of  External  Analysis 

NOBODY DESIGNS OR DEVELOPS SYSTEMS POORLY ON PURPOSE. Developers are a proud lot, and
for the most part they work hard to create solid working systems. This is precisely why
a security risk analysis team should not include anyone from the design and development
team. One essential way in which security testing differs from standard testing is in
the importance of preserving a completely independent view of the system, divorced from
design influences. In general testing, one person can play dual roles: a design team
testing expert to improve testability early on, and an independent tester later in the
process. In security testing there is a much greater risk of tunnel vision.

Putting  together  an  external  team  is  important  for  two  main  reasons: 

•  To  avoid  tunnel  vision.  Designers  and  developers  are  often  too  close  to 
their  systems  and  are  skeptical  that  their  system  may  have  flaws. 

•  To  validate  design  document  integrity. The  requirements  and  specifications 
that  the  designers  and  developers  use  should  be  clear  enough  that  an 
external  team  can  completely  understand  the  system. 

Another reason warranting the use of external teams is the expertise issue. Being an
excellent programmer and understanding security problems are not the same thing.

The good news is that an external team need not be made up of high-priced external
experts. Often it is good enough to have a team from your own organization made up of
security experts who were not involved in design decisions. The bad news is that
security expertise seems to be a rare commodity these days. Determining whether or not
to seek help outside of your organization will depend on what the system you are
designing is meant to do and what happens if it is broken by attackers. If you are
betting your business on a piece of code, it had better not fail unexpectedly.

© 1999 IEEE. Reprinted, with permission, from IEEE Computer, 32(4): 103-105, April 1999.

An experienced team of external analysts considers myriad scenarios during the course
of an analysis. Examples of scenarios include decompilation risks in mobile code
systems, eavesdropping attacks, playback attacks, and denial of service attacks. Testing
is most effective when it is directed instead of random. The upshot is that scenarios
can lead directly to very relevant security tests.

Security  Guidelines 

SECURITY  IS  RISK  MANAGEMENT.  The  risks  to  be  managed  take  on  different  levels  of 
urgency  and  importance  in  different  situations.  For  example,  denial  of  service  may  not  be  of 
major  concern  for  a  client  machine,  but  denial  of  service  on  a  commercial  Web  server  could 
be  disastrous.  Given  the  context-sensitive  nature  of  risks,  how  can  we  compare  and  contrast 
different  systems  in  terms  of  security? 

Unfortunately,  there  is  no  golden  metric.  But  we  have  found  in  practice  that  the  use  of  a 
standardized  set  of  security  analysis  guidelines  is  very  useful.  Our  most  successful  client  makes 
excellent  use  of  security  guidelines  in  their  risk  management  and  security  group. 

The most important feature of any set of guidelines is that they create a framework for
consistency of analysis. Such a framework allows any number of systems to be compared
and contrasted in interesting ways.

Guidelines consist of both an explanation of how to do a security analysis in general,
and what kinds of risks to consider. No such list can be absolute or complete, but
common criteria for analysis, such as the Department of Defense's Trusted Computer
System Evaluation Criteria (commonly called the Orange Book), can be of help.

Much of today's software is developed incredibly quickly under immense market pressure.
Internet time now rivals dog years in duration, approaching a 7:1 ratio with regular
time. Often the first thing to go under pressure from the market is software quality (of
any sort). Security is an afterthought at best, and is often forgotten.

A Common Mistake

BOLTING SECURITY ONTO AN EXISTING SYSTEM IS SIMPLY A BAD IDEA. Security is not a simple
feature you can add to a system at any time. Security is like fault tolerance, a
system-wide emergent property that requires much advance planning and careful design.

We have come across many real-world systems (designed for use over protected proprietary
networks) that were being reworked for use over the Internet. In every one of these
cases, Internet-specific risks caused the systems to lose all their security properties.
Some people refer to this problem as an environment problem, where a system that is
secure enough in one environment is completely insecure when placed in another. As the
world becomes more interconnected via the Internet, the environment most machines find
themselves in is at times less than friendly.

It is always better to design for security from scratch than to try to add security to
an existing design. Reuse is an admirable goal, but the environment in which a system
will be used is so integral to security that any change of environment is likely to
cause all sorts of trouble, so much trouble that well-tested and well-understood things
fall to pieces.

Security  Testing  Versus  Functional  Testing 

FUNCTIONAL TESTING DYNAMICALLY probes a system to determine whether the system does what
it is supposed to do under normal circumstances. Security testing is different. Security
testing probes a system in ways that an attacker might probe it, looking for weaknesses
to exploit. In this sense, advanced testing methodologies like software fault injection
can be used to probe security properties of systems. An active attack is an anomalous
circumstance that not many designers consider.

Security  testing  is  most  effective  when  it  is  directed  by  system  risks  that  are  unearthed  during 
a  risk  analysis. This  implies  that  security  testing  is  a  fundamentally  creative  form  of  testing  that 
is  only  as  strong  as  the  risk  analysis  it  is  based  on.  Security  testing  is  by  its  nature  bounded  by 
identified  risks  (and  the  security  expertise  of  the  tester). 

Code coverage has been shown to be a good metric for understanding how good a particular
set of tests is at uncovering faults. It is always a good idea to use code coverage as a
metric for measuring the effectiveness of functional testing. In terms of security
testing, code coverage plays an even more critical role. Simply put, if there are areas
of a program that have never been exercised during testing (either functional or
security), these areas should be immediately suspect in terms of security. One obvious
risk is that unexercised code will include trojan-horse functionality whereby seemingly
innocuous code carries out an attack. Less obvious (but more pervasive) is the risk that
unexercised code has serious bugs that can be leveraged into a successful attack.
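
A minimal sketch of the idea, assuming coverage is measured at the granularity of whole functions: the hypothetical `CallCoverage` helper below records which registered functions a test run actually called and reports the rest as suspect. The class and the two sample functions are invented for illustration; production coverage tools work at statement or branch level and would be the practical choice.

```python
import functools


class CallCoverage:
    """Tracks which registered functions were called at least once."""

    def __init__(self):
        self.registered = set()
        self.exercised = set()

    def track(self, func):
        """Decorator that registers a function and records each call to it."""
        self.registered.add(func.__name__)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            self.exercised.add(func.__name__)
            return func(*args, **kwargs)

        return wrapper

    def unexercised(self):
        """Names of registered functions that no test has called yet."""
        return sorted(self.registered - self.exercised)


coverage = CallCoverage()


@coverage.track
def parse_request(raw):
    return raw.strip().split(" ")


@coverage.track
def handle_admin_command(command):
    return "admin:" + command


# A "test run" that exercises only the request parser.
parse_request("GET /index.html")
suspect = coverage.unexercised()
```

In this run `handle_admin_command` is never reached, so it lands on the suspect list and deserves either a directed security test or a manual review.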

Dynamic security testing can help ensure that such risks don’t come back to bite you.
Static analysis is useful as well. Many of today’s security problems are echoes of
well-understood old problems that can come around again. The fact that 80 percent of
1998’s CERT alerts involved buffer overflow problems emphasizes the point. There is no
reason that any code today should be susceptible to buffer overflow problems, yet they
remain the biggest source-code security risk today.

It  is  possible  to  scan  security-critical  source  code  for  known  problems,  fixing  any  problems 
encountered.  Current  research  is  exploring  the  utility  of  static  source-code  scanning. The  key 
to  any  such  approach  is  a  deep  knowledge  of  potential  problems. 
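
To give a feel for what such scanning involves, here is a deliberately naive sketch that searches C source text for a few library calls commonly associated with buffer overflows. The list of calls, the warning wording, and the function name `scan_source` are assumptions chosen for the example, and plain pattern matching is a stand-in technique: it will also flag matches inside comments and strings, and it knows nothing about how a call is actually used, so it should not be read as how any particular scanning tool works.

```python
import re

# Hypothetical knowledge base: call name -> why it deserves a closer look.
RISKY_CALLS = {
    "gets": "reads input with no bound on length",
    "strcpy": "copies with no bound on destination size",
    "strcat": "appends with no bound on destination size",
    "sprintf": "formats with no bound on destination size",
}

# Match a risky name used as a whole word and followed by an opening parenthesis.
CALL_PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def scan_source(source):
    """Return (line_number, call_name, reason) for each risky call found."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            name = match.group(1)
            findings.append((line_number, name, RISKY_CALLS[name]))
    return findings


sample = """\
char buf[16];
gets(buf);
strcpy(buf, argv[1]);
strncpy(buf, argv[1], sizeof buf - 1);
"""
findings = scan_source(sample)
```

The bounded `strncpy` call on the last line is not reported, while the two unbounded calls are. The value of any such scanner lies almost entirely in the quality of its knowledge base, which is the point the paragraph above makes about deep knowledge of potential problems.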

There are many areas today in which software must behave itself. Good software assurance
practices can help ensure that software behaves properly. Safety-critical and
high-assurance systems have always taken great pains to analyze and track software
behavior. With software finding its way into every aspect of our lives, the importance
of software assurance will only grow. We can avoid the band-aid-like penetrate-and-patch
approach to security only by considering security as a crucial system property and not
as a simple add-on feature.

Computer security is becoming more important because the world is becoming highly
interconnected and the network is being used to carry out critical transactions. The
environment that machines must survive in has changed radically. Deciding to connect a
LAN to the Internet is a security-critical decision. The root of most security problems
is software that fails in unexpected ways. Though software assurance has much maturing
to do, it has much to offer to those practitioners interested in striking at the heart
of security problems.